Abstract



Subramanian defined the complexity class CC as the set of problems log-space reducible to the comparator circuit
value problem. He proved that several other problems are complete for CC, including the stable marriage problem,
and finding the lexicographically first maximal matching in a bipartite graph.

We introduce universal comparator circuits, and as applications we prove alternative characterizations of CC: as
the set of problems AC^0 many-one reducible to the comparator circuit value problem, and as problems computable
by uniform polynomial-size families of comparator circuits supplied with polynomially many copies of the input and
its negation. We also show that CC is closed under AC^0 'circuit' reductions (i.e. reductions given by a uniform family of
AC^0 circuits with oracle gates making queries to other CC problems), and that the corresponding function class FCC is
closed under composition.

Subramanian showed that NL ⊆ CC ⊆ P. We provide evidence that CC and NC are incomparable (so that CC is a
proper subset of P), by giving oracle settings where relativized CC and relativized NC are incomparable. We also give
evidence that CC and SC are incomparable.

Other results include a simpler proof of NL ⊆ CC, a more careful analysis showing the lexicographically first
maximal matching problem and its variants are CC-complete under AC^0 many-one reductions, and an explanation of the
relation between the Gale-Shapley algorithm and Subramanian's algorithm for stable marriage.

The paper continues the previous work of Cook, Le and Ye which focused on Cook-Nguyen style uniform proof 
complexity, answering several open questions raised in that paper. 

1 Introduction

Comparator networks were originally introduced as a method of sorting numbers (as in Batcher's even-odd
merge sort [5]), but they are still interesting when the numbers are restricted to the Boolean values {0,1} (in
fact, a sorting network made from comparators is valid if and only if it works on Boolean inputs). A comparator
gate has two inputs p, q and two outputs p', q', where p' = min{p, q} and q' = max{p, q}. In the Boolean case
(which is the one we consider) p' = p ∧ q and q' = p ∨ q. A comparator circuit (i.e. network) is presented as a
set of m horizontal lines in which the m inputs are presented at the left ends of the lines and the m outputs are
presented at the right ends of the lines, and in between there is a sequence of comparator gates, each represented
as a vertical arrow connecting some wire w_i with some wire w_j as shown in Fig. 1. These arrows divide each wire
into segments, each of which gets a Boolean value. The values of wires w_i and w_j after the arrow are the
comparator outputs of the values of wires w_i and w_j right before the arrow, with the tip of the arrow representing the
maximum.

[Figure 1]

The comparator circuit value problem (Ccv) is: given a comparator circuit with specified Boolean inputs,
determine the output value of a designated wire. To turn this into a complexity class it seems natural to use a
reducibility notion that is weak but robust. Thus we define CC to consist of those problems (uniform) AC^0
many-one-reducible to Ccv. (Subramanian [17] studied the complexity of Ccv using log-space reducibility, but
fortunately our class CC is closed under log-space reducibility as well as AC^0-reducibility.) From [17] we have

NL ⊆ CC ⊆ P,    (1.1)
where NL is nondeterministic log space. The last inclusion is obvious because Ccv is a special case of the
monotone circuit value problem, which is clearly in P. However comparator circuits are more restricted than
monotone Boolean circuits, because each comparator gate output has fan-out one. We conjecture that CC ⊊ P, and
further we conjecture that CC and NC are incomparable. Here NC is the class of problems computed by uniform
circuit families of polynomial size and polylog depth; intuitively NC is polylog parallel time. Similar to (1.1) we
have

NL ⊆ NC ⊆ P.

In Section 5 we prove results supporting our conjecture that CC is incomparable with NC.

The complexity class CC has a number of disparate complete problems, including the comparator circuit
value problem (Ccv), the lexicographically first maximal matching problem (Lfmm), and the stable marriage
problem (Sm) [12, 17]. (The first author outlined a proof that Lfmm is complete under NC^1-reductions in
unpublished notes from 1983.) The Sm problem is especially interesting: introduced by Gale and Shapley in 1962 [8],
it has since been used to pair medical interns with hospital residencies in the USA. Sm can be stated as follows:
Given n men and n women, each with a complete ranking according to preference of all n members of the
opposite sex, find a complete matching of the men and women such that there are no two people of opposite sex
who would both rather have each other than their current partners. Gale and Shapley proved that such a 'stable'
matching always exists, although it may not be unique.

Other interesting CC-complete problems include the stable roommate problem [17], the telephone
connection problem [15], the problem of predicting internal diffusion-limited aggregation clusters from theoretical
physics [13], and the decision version of the hierarchical clustering problem [9]. We refer to [17] for other
CC-complete problems.

The present paper continues the research initiated in [11, 5], in which two of the present authors participated.
The former work introduced a formal theory VCC which captures the complexity class CC (in the style of Cook
and Nguyen [4]), and showed that Subramanian's results are formalizable in the theory. On the way some of the
proofs were simplified. In the present paper we resolve some important complexity-theoretic questions left open
in [11]:

■ We introduce a notion of relativized CC and prove that it is incomparable with relativized NC (Section 5).

■ We show that CC is closed under AC^0 'circuit' reductions (Section 3), thus identifying the three complexity
classes defined in [11].

■ We clarify the connection between the Gale-Shapley algorithm for stable marriage and Subramanian's
algorithm (Section 7).

Concerning the first item above, NC is often characterized informally as the class of problems which can be
solved rapidly in parallel (i.e. in polylog time) using polynomially many processors. The lexicographically first
maximal matching problem is in CC, but the obvious algorithm for solving it is sequential (accumulate a
matching by successively adding the first unmatched edge which doesn't touch any edge in the current matching), and
we do not know of any parallel algorithm for this which approaches polylog time. On the other hand, the problem
of raising an n × n integer matrix to the nth power is in NC^2, but we do not know how to solve it using polynomial
size comparator circuits, even if ¬ gates are allowed.

The apparent reason that comparator circuits are limited in their computing power compared to Boolean
circuits is their limited fan-out: each comparator gate has two inputs and just two outputs (∧ and ∨). Further, if
either input value to a comparator gate is 'flipped' (i.e. changed from 0 to 1 or from 1 to 0) then exactly one of its
outputs is flipped. This generalizes to the whole circuit: if the input to any wire in a comparator circuit (possibly
with ¬-gates) is flipped then at every layer in the circuit, including the output, exactly one wire is flipped. Thus
flipping an input generates a unique flip-path through the circuit from one input wire to one output wire.
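The flip-path property can be checked experimentally. The following Python sketch is only an illustration: the gate encoding (`("cmp", i, j)` sends the minimum to wire `i` and the maximum to wire `j`, `("not", i)` negates wire `i`) and all function names are assumptions made for this example, not notation from the paper.

```python
import random


def run_layers(gates, inputs):
    """Return the wire values before the first gate and after each gate."""
    wires = list(inputs)
    history = [tuple(wires)]
    for gate in gates:
        if gate[0] == "cmp":
            _, i, j = gate
            # Comparator: wire i receives the min, wire j receives the max.
            wires[i], wires[j] = min(wires[i], wires[j]), max(wires[i], wires[j])
        else:
            _, i = gate
            wires[i] = 1 - wires[i]  # negation gate
        history.append(tuple(wires))
    return history


def flip_path(gates, inputs, k):
    """Flip input wire k and return the single wire that differs at each layer."""
    flipped = list(inputs)
    flipped[k] = 1 - flipped[k]
    path = []
    for before, after in zip(run_layers(gates, inputs), run_layers(gates, flipped)):
        differing = [w for w, (a, b) in enumerate(zip(before, after)) if a != b]
        if len(differing) != 1:
            raise AssertionError("flip-path property violated")
        path.append(differing[0])
    return path


def random_circuit(rng, m, size):
    """Hypothetical helper: a random circuit on m wires with comparators and negations."""
    gates = []
    for _ in range(size):
        if rng.random() < 0.3:
            gates.append(("not", rng.randrange(m)))
        else:
            i, j = rng.sample(range(m), 2)
            gates.append(("cmp", i, j))
    return gates


example = [("cmp", 0, 1), ("not", 1), ("cmp", 1, 2)]
print(flip_path(example, [1, 0, 1], 0))
```

Each entry of the returned list names the one wire on which the two runs disagree at that layer, so the list traces the flip-path from the flipped input wire to an output wire.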

Proving either half of the conjecture (NC ⊈ CC or CC ⊈ NC) would require a major breakthrough in complexity
theory, so instead we prove a suitable relativized version of the conjecture. For this we allow oracle comparator
circuits to have oracle gates, as well as comparator gates and ¬ gates.
Paper organization

In Section 2 we define the basic concepts, including CC and its complete problems.

In Section 3 we introduce the notion of a universal comparator circuit. As applications we prove several
results showing the robustness of the class CC. We show that the class FCC of functions associated with
CC is closed under composition, and hence CC is closed under AC^0 'circuit' reductions (also called simply AC^0
reductions [2, 4]); these reductions are given by a uniform family of AC^0 circuits with oracle gates making queries
to other CC problems. We also characterize CC in the style of other circuit classes such as NC and AC: a problem
is in CC if and only if it is computed by some uniform polynomial size family of comparator circuits (where the
circuit inputs are allowed repeated copies of the problem input bits and their negations).

In Section 4 we give a technical overview of the remaining sections of the paper in order to present these
within the first ten pages.

In Section 5 we define a notion of relativized CC and prove that relativized CC is incomparable with relativized
NC. This of course implies that relativized CC is strictly contained in relativized P. We also argue that CC and SC
might be incomparable.

In Section 6 we prove that the lexicographically first maximal matching problem and its variants are
complete for CC under AC^0 many-one reductions. We also claimed a proof that the lexicographically first maximal
matching problem is complete for CC under AC^0 many-one reductions in [11], but there was a gap in that proof.

In Section 7 we show that the stable marriage problem is complete for CC, using Subramanian's algorithm
[16, 17]. We show that Subramanian's fixed-point algorithm, which uses three-valued logic, is related to the
Gale-Shapley algorithm via an intermediate interval algorithm. The latter algorithm also explains the provenance of
three-valued logic: an interval partitions a person's preference list into three parts, so we use three values {0, *, 1}
to encode these three parts of a preference list.

In Appendix A we include a simple proof that CC contains NL.

2 Preliminaries 

2.1 Notation 

We use lower case letters, e.g. x, y, z, to denote unary arguments and upper case letters, e.g. X, Y, Z, to 
denote binary string arguments. For a binary string X, we write |X| to denote the length of X. 

2.2 Function classes and search problems 

A complexity class consists of relations R(X), where X is a binary string argument. Given a class of relations C,
we associate a class FC of functions F(X) with C as follows. We require these functions to be p-bounded, i.e.,
|F(X)| is bounded by a polynomial in |X|. Then we define FC to consist of all p-bounded string functions whose
bit graphs are in C. (Here the bit graph of F(X) is the relation B_F(i, X) which holds iff the ith bit of F(X) is 1.)

Most of the computational problems we consider here can be nicely expressed as decision problems (i.e. 
relations), but the stable marriage problem is an exception, because in general a given instance has more than 
one solution (i.e. there is more than one stable marriage). Thus the problem is properly described as a search 
problem. A search problem Q_R is a multivalued function with graph R(X, Z), so Q_R(X) = {Z | R(X, Z)}.

The search problem is total if the set Q_R(X) is non-empty for all X. The search problem is a function problem
if |Q_R(X)| = 1 for all X. A function F(X) solves Q_R if F(X) ∈ Q_R(X) for all X. We will be concerned only with total
search problems in this paper.

2.3 Reductions 

Let C be a complexity class. A relation R_1(X) is C many-one reducible to a relation R_2(Y) (written R_1 ≤^C_m R_2) if
there is a function F in FC such that R_1(X) ↔ R_2(F(X)).

A search problem Q_{R_1}(X) is C many-one reducible to a search problem Q_{R_2}(Y) if there are functions G, F in
FC such that G(X, Z) ∈ Q_{R_1}(X) for all Z ∈ Q_{R_2}(F(X)).

Here we are mainly interested in the cases that C is either AC^0 or L (log space). We also need a generalization
of AC^0 many-one reducibility called simply AC^0 reducibility (see [2]) and denoted by ≤^{AC^0}. A function or relation
is AC^0 reducible to a collection ℒ of functions and relations if it can be computed by a uniform polynomial size
constant depth family of circuits which have unbounded fan-in gates computing functions and relations from
ℒ (i.e. 'oracle gates'), in addition to Boolean gates.

We note that standard small complexity classes including AC^0, TC^0, NC^1, NL and P (as well as their
corresponding function classes) are closed under AC^0-reductions.

2.4 Ccv and its complexity class 

A comparator gate is a function C : {0,1}^2 → {0,1}^2 that takes an input pair (p, q) and outputs a pair (p ∧ q, p ∨ q).
Intuitively, the first output in the pair is the smaller bit among the two input bits p, q, and the second output is
the larger bit.

We will use the graphical notation on the right to denote a comparator gate, where x and y denote the names of
the wires, and the direction of the arrow denotes the direction to which we move the larger bit as shown in the
picture.

[Figure: a comparator gate on wires x and y with inputs p, q and outputs p ∧ q, p ∨ q.]

A comparator circuit can be defined as a directed acyclic graph consisting of: input nodes with in-degree zero 
and out-degree one, output nodes with in-degree one and out-degree zero, and internal nodes with in-degree 
two and out-degree two. Each internal node represents a comparator gate, with one out-edge labeled AND and 
the other labeled OR. If there are m input nodes then there must be m output nodes, and the circuit computes a 
function f : {0,1}^m → {0,1}^m in the obvious way.

Under this definition each comparator circuit can be represented by m horizontal wires that carry bit values 
(see Fig. 1 on page 1), where m is the number of input nodes. Each comparator gate is represented by a vertical
arrow connecting two of the wires. The arrowhead (representing the OR of the two inputs) can be chosen at will 
to point up or down, but the decision affects which wire future gates connect to. To see this, topologically sort the 
internal nodes of the graph (representing the comparator gates), and arrange the corresponding arrows in order 
from left to right. The left endpoints of the wires represent the m input nodes, and the gates can be placed one 
by one from left to right, each connecting the appropriate wires (determined by looking back to the last output 
gate touching the wire). 

In this paper we present a comparator circuit by specifying its representation as horizontal wires connected
by arrows, as explained in the previous paragraph. We encode the circuit by a triple (m, n, X), where m is the
number of wires and n is the number of gates, and X encodes a sequence of n pairs (i, j) with 0 ≤ i, j < m, where
each pair (i, j) encodes a comparator gate that swaps the values of the wires i and j iff the value on wire i is
greater than the value of wire j. For technical reasons, we also allow "dummy" gates of the form (i, i), which do
nothing.
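As an illustration of this encoding, the following Python sketch evaluates a circuit given as a triple (m, n, X) on an input assignment Y. The function names and the use of Python lists for X and Y are assumptions made for the example; the paper itself fixes only the triple encoding.

```python
from itertools import product


def eval_comparator_circuit(m, n, X, Y):
    """Evaluate the circuit (m, n, X) on input bits Y and return all m output bits."""
    if len(X) != n or len(Y) != m:
        raise ValueError("encoding does not match the declared sizes")
    wires = list(Y)
    for i, j in X:
        # Gate (i, j): swap iff wire i carries a larger value than wire j.
        # A dummy gate (i, i) never swaps.
        if wires[i] > wires[j]:
            wires[i], wires[j] = wires[j], wires[i]
    return wires


def ccv(m, n, X, Y, designated=0):
    """Decide whether the designated wire outputs 1."""
    return eval_comparator_circuit(m, n, X, Y)[designated] == 1


# A three-wire sorting network followed by one dummy gate.
SORT3 = (3, 4, [(0, 1), (1, 2), (0, 1), (2, 2)])

for Y in product((0, 1), repeat=3):
    print(Y, eval_comparator_circuit(*SORT3, list(Y)))
```

With this orientation every gate (i, j) leaves the smaller bit on wire i and the larger bit on wire j, so the example network moves all ones to the high-numbered wires.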

► Definition 1. The comparator circuit value problem (Ccv) is the decision problem: Given a comparator circuit
and an assignment of bits to the input nodes, decide whether a designated wire outputs one. By default, we often
let the designated wire be the 0th wire of a circuit.

The complexity class CC is the class of decision problems that are AC^0 many-one reducible to Ccv. A decision
problem R is CC-complete if CC is the closure of R under AC^0 many-one reductions.

In our definition of comparator circuit each comparator gate can point in either direction, up or down (see
Fig. 1). As mentioned earlier, an equivalent circuit with the same number of wires and gates can always be
constructed so that all gates point up, or all gates point down, or in fact each gate can be made to point up or
down arbitrarily. Transforming a circuit to one of these forms (with the same number of gates) cannot necessarily
be done by an AC^0 function, but it is not hard to show the following.
► Proposition 2. The Ccv problem with the restriction that all comparator gates point in the same direction is
CC-complete.



Proof. Suppose we have a gate on the left of Fig. 2 with the arrow pointing upward. We can construct a circuit
that outputs the same values as those of x and y, but all the gates will now point downward as shown on the right
of Fig. 2.



Figure 2



It is not hard to see that the wires x_1 and y_1 in this new comparator circuit will output the same values as the
wires x and y respectively in the original circuit. For the general case, we can simply make copies of all wires for
each layer of the comparator circuit, where each copy of a wire will be used to carry the value of a wire at a single
layer of the circuit. Then apply the above construction to simulate the effect of each gate. Note that additional
comparator gates are also needed to forward the values of the wires from one layer to another, in the same way
that we use the gate (y_0, y_1) to forward the value carried in wire y_0 to wire y_1 in the above construction.

To carry this out in AC^0, one way would be to add a complete copy of all wires for every comparator gate in
the original circuit. Each new wire has input 0. For each original gate, first put in gates copying the values to the
new wires, and either put in the construction in Fig. 2 or, if the gate points down, simply put a copy of the gate in
the original circuit. ◀



2.5 The stable marriage problem 

An instance of the stable marriage problem (Sm), proposed by Gale and Shapley [8] in the context of college
admissions, is given by a number n (specifying the number of men and the number of women), together with a
preference list for each man and each woman specifying a total ordering on all people of the opposite sex. The
goal of Sm is to produce a perfect matching between men and women, i.e., a bijection from the set of men to the
set of women, such that the following stability condition is satisfied: there are no two people of opposite sex who
like each other more than their current partners. Such a stable solution always exists, but it may not be unique.
Thus Sm is a search problem (see Section 2.2), rather than a decision problem.

However there is always a unique man-optimal and a unique woman-optimal solution. In the man-optimal 
solution each man is matched with a woman whom he likes at least as well as any woman that he is matched 
with in any stable solution. Dually for the woman-optimal solution. Thus we define the man-optimal stable 
marriage decision problem (MoSm) as follows: given an instance of the stable marriage problem together with 
a designated man-woman pair, determine whether that pair is married in the man-optimal stable marriage. We 
define the woman-optimal stable marriage decision problem (WoSm) analogously.
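To make the definitions of Sm and MoSm concrete, here is a Python sketch of the standard men-proposing Gale-Shapley procedure, a stability check, and a brute-force enumeration of all stable matchings for small n. The function names and the representation of preference lists as Python lists of indices (most preferred first) are assumptions of this example. The sketch uses the sequential algorithm only; it is not the comparator-circuit algorithm discussed later in the paper.

```python
from itertools import permutations


def gale_shapley(men_prefs, women_prefs):
    """Men propose; returns wife[m] for every man m (the man-optimal matching)."""
    n = len(men_prefs)
    # rank[w][m] is the position of man m in woman w's list (lower is better).
    rank = [{m: r for r, m in enumerate(prefs)} for prefs in women_prefs]
    next_choice = [0] * n
    husband = [None] * n
    free_men = list(range(n))
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = husband[w]
        if current is None:
            husband[w] = m
        elif rank[w][m] < rank[w][current]:
            husband[w] = m
            free_men.append(current)
        else:
            free_men.append(m)
    wife = [None] * n
    for w, m in enumerate(husband):
        wife[m] = w
    return wife


def is_stable(men_prefs, women_prefs, wife):
    """Check that no man and woman prefer each other to their partners."""
    n = len(wife)
    husband = [None] * n
    for m, w in enumerate(wife):
        husband[w] = m
    for m in range(n):
        for w in men_prefs[m]:
            if w == wife[m]:
                break  # every later woman is ranked below m's partner
            if women_prefs[w].index(m) < women_prefs[w].index(husband[w]):
                return False
    return True


def stable_matchings(men_prefs, women_prefs):
    """Brute force over all perfect matchings; only sensible for tiny n."""
    n = len(men_prefs)
    return [list(p) for p in permutations(range(n))
            if is_stable(men_prefs, women_prefs, list(p))]


def mosm(men_prefs, women_prefs, man, woman):
    """MoSm: is (man, woman) a pair in the man-optimal stable marriage?"""
    return gale_shapley(men_prefs, women_prefs)[man] == woman


men = [[0, 1], [1, 0]]
women = [[1, 0], [0, 1]]
print(stable_matchings(men, women))
print(gale_shapley(men, women))
```

In this two-couple instance both perfect matchings are stable, which illustrates why Sm is a search problem, while the man-optimal matching returned by `gale_shapley` is a single well-defined answer.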

We show here that the search version and the decision versions are computationally equivalent, and each is
complete for CC. Section 7.1 shows how to reduce the lexicographically first maximal matching problem (which
is complete for CC) to the Sm search problem, and Section 7.3 shows how to reduce both the MoSm and WoSm
problems to Ccv.

3 Universal comparator circuits 

Here we present a construction of universal comparator circuits. The key idea is a gadget consisting of a
comparator circuit with four wires and four gates which allows a conditional application of a comparator to two of its
inputs x, y, depending on whether a bit b is 0 or 1. The other two inputs are b and ¬b (see Fig. 3). The comparator
is applied only when b = 1 (see Fig. 4).
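Since the figures are not reproduced here, the following Python sketch gives one possible four-wire, four-gate gadget with the stated behaviour and verifies it exhaustively. This particular gate sequence is an assumption made for illustration and is not necessarily the gadget drawn in Fig. 3. A gate (i, j) is taken to place the minimum on wire i and the maximum on wire j.

```python
from itertools import product


def evaluate(gates, inputs):
    """Apply gates (i, j) in order: wire i gets the min, wire j gets the max."""
    wires = list(inputs)
    for i, j in gates:
        wires[i], wires[j] = min(wires[i], wires[j]), max(wires[i], wires[j])
    return wires


def conditional_gadget(x, y, b, nb):
    """Four gates on wires x, y, b, nb, where nb initially carries the negation of b.

    If b = 1 the gadget leaves min(x, y) on wire x and max(x, y) on wire y.
    If b = 0 wires x and y are unchanged.
    """
    return [
        (b, x),   # b = 1: wire b now holds x and wire x holds 1
        (b, y),   # wire y becomes x OR y, wire b becomes x AND y (when b = 1)
        (x, nb),  # b = 1: wire x is reset to 0; b = 0: wire x keeps its value
        (b, x),   # wire x receives x AND y (b = 1) or keeps x (b = 0)
    ]


X_WIRE, Y_WIRE, B_WIRE, NB_WIRE = 0, 1, 2, 3
GADGET = conditional_gadget(X_WIRE, Y_WIRE, B_WIRE, NB_WIRE)

for x, y, b in product((0, 1), repeat=3):
    out = evaluate(GADGET, [x, y, b, 1 - b])
    expected = (min(x, y), max(x, y)) if b else (x, y)
    assert (out[X_WIRE], out[Y_WIRE]) == expected
```

The control wires b and ¬b are consumed by the gadget, which is why a universal circuit built this way needs a fresh pair of control wires for every gadget.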
Figure 3 Conditional comparator gadget



Figure 4 Operation of conditional comparator gadget 



In order to simulate a single arbitrary comparator in a circuit with m wires we put in m(m - 1) gadgets in a
row, for the m(m - 1) possible comparators. Simulating n comparators requires m(m - 1)n gadgets.

Thus there is an AC^0 function UNIV such that if m, n are arbitrary parameters, then U = UNIV(m, n) =
(m', n', U') is a universal circuit which simulates all comparator networks with at most m wires and at most n
comparators, where the number of wires in U is m' = 2m(m - 1)n + m (note that the original m wires are
common to all of the gadgets) and the number of gates is n' = 4m(m - 1)n.

The AC^0 function INPUT(C, Y) = Y' computes the input bits Y' for the universal circuit U = UNIV(m, n),
where C = (m̃, ñ, X) with m̃ ≤ m and ñ ≤ n. Then U with input Y' simulates the circuit C with input Y. We may
arrange the universal circuit so that the m wires of the original circuit C correspond to wires number 0, 1, ..., m - 1
in UNIV(m, n) (and the remaining input wires specified by Y' code the inputs to the gadgets in U to simulate the
comparator gates of C). From this construction, the following theorem follows.

► Theorem 3. The circuit UNIV(m, n) has the intended universal property. In other words, the comparator circuit
UNIV(m, n) on input INPUT(C, Y) outputs on its first m wires precisely the outputs of the m wires of the original
circuit C on input Y.
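The construction can be sketched in Python as follows. The sketch assumes a particular four-gate conditional gadget (gates `(b, x), (b, y), (x, nb), (b, x)`, where a gate (i, j) puts the minimum on wire i and the maximum on wire j), which need not coincide with the gadget in Fig. 3. The names `univ` and `univ_input`, and the padding of unused original wires with 0, are likewise assumptions of this example.

```python
from itertools import product


def evaluate(gates, inputs):
    """Apply gates (i, j) in order: wire i gets the min, wire j gets the max."""
    wires = list(inputs)
    for i, j in gates:
        wires[i], wires[j] = min(wires[i], wires[j]), max(wires[i], wires[j])
    return wires


def ordered_pairs(m):
    """The m(m - 1) possible non-dummy comparators on m wires."""
    return [(i, j) for i in range(m) for j in range(m) if i != j]


def univ(m, n):
    """Return (m', n', gates): n rounds, each holding one gadget per possible comparator."""
    gates = []
    next_wire = m  # wires 0..m-1 are the simulated wires
    for _ in range(n):
        for x, y in ordered_pairs(m):
            b, nb = next_wire, next_wire + 1  # fresh control wires for this gadget
            next_wire += 2
            gates += [(b, x), (b, y), (x, nb), (b, x)]
    return next_wire, len(gates), gates


def univ_input(m, n, circuit, y_bits):
    """Input bits for univ(m, n) that make it simulate `circuit` on `y_bits`."""
    mc, nc, gate_list = circuit
    if mc > m or nc > n:
        raise ValueError("circuit too large for this universal circuit")
    bits = list(y_bits) + [0] * (m - mc)
    for step in range(n):
        active = tuple(gate_list[step]) if step < nc else None
        for pair in ordered_pairs(m):
            b = 1 if pair == active else 0  # dummy gates (i, i) switch nothing on
            bits += [b, 1 - b]
    return bits


m_prime, n_prime, U = univ(3, 4)
C = (3, 3, [(0, 1), (1, 2), (0, 1)])  # a three-wire sorting network
for Y in product((0, 1), repeat=3):
    out = evaluate(U, univ_input(3, 4, C, list(Y)))
    assert out[:3] == evaluate(C[2], list(Y))
```

The wire and gate counts produced by `univ` agree with the formulas m' = 2m(m - 1)n + m and n' = 4m(m - 1)n, since each gadget contributes two control wires and four gates.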

The next result is an important application of universal comparator circuits. It tells us that the class CC can
be characterized in terms of uniform circuit families, as in the definitions of the complexity classes NC^k and AC^k.

Let f_Ccv(x, y, X, Y) be the 0/1-function that returns the value of the 0th wire of the comparator circuit with x
wires and y comparator gates encoded in X taking as input the binary string Y. By a slight abuse of notation we
will write f_Ccv(C, Y) instead of f_Ccv(x, y, X, Y), where C = (x, y, X).

► Theorem 4. For each relation R(X) in CC there is a family {C^R_k}_{k∈ℕ} of comparator circuits described by AC^0
functions of k such that the 0th wire of C^R_k outputs R(X) when supplied with copies of X(i), ¬X(i) for i < k as
inputs. More precisely, there are AC^0 functions CIRCUIT_R and IN_R such that

|X| < k → [R(X) ↔ f_Ccv(CIRCUIT_R(k), IN_R(k, X)) = 1]

where CIRCUIT_R(k) encodes a comparator circuit with n gates and Y = IN_R(k, X) consists of (for some polynomial
p = p(k)) p copies of X(i) and p copies of ¬X(i), for i = 0, 1, ..., k - 1.



Proof. First observe that the theorem holds in case the relation R(X) is in AC^0. This is because an AC^0 circuit is
easily converted to a tree whose internal nodes are AND or OR gates with fan-out one (which can be converted to
comparator gates) and whose leaves have inputs of the form IN_R(k, X) as described in the theorem. (No universal
circuit is needed for this.)

If F(X) is an AC^0 function, then its bit graph B_F(i, X) is an AC^0 relation. Hence the above construction can
be used to construct an AC^0-uniform comparator circuit family CIRCUIT_F(k) which on input IN_F(k, X) of the
form described in the theorem, outputs the bits of F(X). This construction will be applied to the function F(X)
defined in (3.1) below.

Now suppose R(X) is in CC. Then R(X) is AC^0 many-one reducible to Ccv (see Definition 1). Hence there are
AC^0 functions CIR_R(X) and INP_R(X) such that

R(X) ↔ Ccv(CIR_R(X), INP_R(X)).

Now given k, choose m_k and n_k such that for all X with |X| < k, if (m, n, X') = CIR_R(X) then m ≤ m_k and
n ≤ n_k. We may choose m_k and n_k to be polynomials in k, because CIR_R is an AC^0 function. Then we define the
circuit CIRCUIT_R(k) in the theorem to consist of the universal circuit UNIV(m_k, n_k) preceded by a comparator
circuit CIRCUIT_F(k) computing the AC^0 function

F(X) = INPUT(CIR_R(X), INP_R(X))    (3.1)

for |X| < k, where INPUT is as in Theorem 3. The outputs of this circuit are connected to the inputs of UNIV(m_k, n_k).
The function IN_R(k, X) is the same as IN_F(k, X), which supplies inputs to CIRCUIT_F(k) (see Fig. 5).

Figure 5 The construction of a comparator circuit for any relation R(X) in CC for a fixed k.

Thus by (3.1) and Theorem 3 the 0th wire of UNIV(m_k, n_k) outputs R(X). ◀

► Corollary 5. The function class FCC is closed under composition. 

Proof. We must show that if G(X) and H(X) are in FCC then so is H ∘ G(X). Recall that a function G(X) is in FCC
iff G(X) is polynomially bounded and its bit graph B_G(i, X) is in CC. Thus by Theorem 4 and its proof, a function
G(X) is in FCC iff there is an AC^0-uniform family {C^G_k}_{k∈ℕ} such that for |X| < k, when polynomially many copies
of X(i) and ¬X(i), i < k, are fed as inputs to the circuit using the AC^0 function IN_G, then the outputs of the
bottom wires of C^G_k are the first p(k) bits of G(X), where p(k) is a polynomial upper bound on |G(X)| for |X| < k.

The circuit C^{H∘G}_k can be constructed from polynomially many copies of comparator circuits stacked
vertically (computing many copies of G(X)) that serve as inputs to the circuit C^H_{p(k)} computing H. Actually the circuit
C^H_{p(k)} also needs negations of the bits of G(X) as inputs. These are easily supplied, using the fact that if C is a
comparator circuit and C' is the result of flipping each gate in C (interchanging AND and OR gates), then (by De
Morgan's Laws) the output bits of C' with input Y' are the negations of the output bits of C with input Y, where
Y' is the string of negations of the inputs Y. ◀
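The De Morgan duality used at the end of this proof is easy to confirm on small examples. In the Python sketch below, a gate (i, j) places the AND of its inputs on wire i and the OR on wire j, so flipping a gate amounts to swapping its two indices. The encoding and function names are assumptions of this example.

```python
from itertools import product


def evaluate(gates, inputs):
    """Apply gates (i, j) in order: wire i gets the AND (min), wire j the OR (max)."""
    wires = list(inputs)
    for i, j in gates:
        wires[i], wires[j] = min(wires[i], wires[j]), max(wires[i], wires[j])
    return wires


def flip_gates(gates):
    """Interchange the AND and OR outputs of every gate."""
    return [(j, i) for i, j in gates]


def negate(bits):
    return [1 - bit for bit in bits]


CIRCUIT = [(0, 1), (2, 3), (1, 2), (3, 0), (0, 2)]
DUAL = flip_gates(CIRCUIT)

for Y in product((0, 1), repeat=4):
    # The dual circuit on negated inputs yields the negated outputs.
    assert evaluate(DUAL, negate(Y)) == negate(evaluate(CIRCUIT, list(Y)))
```

The check succeeds gate by gate: on inputs ¬p, ¬q the flipped gate outputs ¬p ∧ ¬q = ¬(p ∨ q) on the wire that originally carried p ∨ q, and ¬p ∨ ¬q = ¬(p ∧ q) on the other.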

The following theorem shows that even though CC is defined as the closure of the Ccv problem under AC^0
many-one reducibility, CC is also closed under the stronger AC^0 reducibility as defined in Section 2.3.

► Theorem 6. CC is closed under AC^0 reductions.

Proof. Theorem IX.1.7 in [4] implies that a function is AC^0 reducible to FCC if and only if it can be obtained from
FCC by finitely many applications of function composition and string comprehension. The latter is explained by
the following definition: For a function f(x) which produces only unary outputs and may contain other
arguments, the string comprehension of f is the function F(y) (which outputs binary strings) satisfying

F(y)(i) ↔ ∃x ≤ y, i = f(x).

Thus, it suffices to show that FCC is closed under composition and string comprehension. We know FCC is closed
under composition by Corollary 5, so it remains to show that FCC is closed under string comprehension. We can
show this directly from the definition of FCC, without referring to any of the previous theorems. We just observe
that a collection of y + 1 instances of Ccv determining whether i = f(x) for x = 0, 1, ..., y can be combined by AC^0
functions of y to form an instance of Ccv determining F(y)(i). ◀

Corollary 5 and the fact that NL ⊆ CC (see Appendix A for a simple proof of NL ⊆ CC) give us the following
result.

► Theorem 7. CC is closed under many-one NL reductions.

Proof. This follows from the following three facts: The function class FCC is closed under composition (by
Corollary 5), FNL ⊆ FCC, and a decision problem is in CC if and only if its characteristic function is in FCC.

4 Technical overview 

In this section we summarize the main results and techniques used in the remaining sections of the paper so that 
these ideas are presented within the first ten pages. 

4.1 Oracle separations 

In Section 5 we show that the relativized versions of NC and CC are incomparable. Our strategy is to construct
functions depending only on the oracle α that are computable in one relativized complexity class but not in the
other. We think of the oracle α as a length-preserving function with n inputs and n outputs. In order to avert
possible criticism, we make sure that the separation works even under the promise that α has the property that
if one input bit is changed, then exactly one output bit is changed. (This property is satisfied by all comparator
circuits.) An oracle satisfying this property is called strictly 1-Lipschitz.

A function in relativized CC but not in relativized NC. We use an idea taken from [1]: the required
function involves iterating α on the constant input 0^n, n times. This is trivially implementable in relativized CC. An
information-theoretic argument from [1] shows that a depth-k circuit can only "know" the value of the kth
iteration of α. Much more work (and a modified function) is needed to get a separation with a strictly 1-Lipschitz
oracle.

A function in relativized NC^2 but not in relativized CC. The basic weakness of CC exploited here is the
fan-out restriction. Suppose the oracle α, given n inputs, yields only n/2 outputs. We can feed the output back into
another instance of α by using two copies of each output bit. Intuitively, a comparator circuit computing the
mth iteration of α needs 2^m - 1 copies of α, or alternatively 2^n copies of α (and a complicated circuit analyzing
the output). When m = Ω(log^2 n), this construction requires a super-polynomial size comparator circuit
computing the mth iteration of α. On the other hand, for m = O(log^2 n), the mth iteration can be easily computed in
relativized NC^2.

While we were unable to prove that the particular function just described is hard for comparator circuits, we
are able to prove a min(2^n, 2^{m-1}) lower bound for a related function. The crucial property of comparator circuits
we use is the flip-path property:

If one input wire is changed, then (given that α is 1-Lipschitz) there is a unique path through the circuit
tracing the effect of the original flip.

We use a Gray code to order the possible n-bit outputs of the oracle and study the effects of the 2^n flip-paths
generated as the definition of the oracle is successively changed.
Figure 6 The thick edges form the lfm-matching of the above bipartite graph.

4.2 Lexicographically first maximal matching 

Let G = (V, W, E) be a bipartite graph, where V = {v_i}_{i=0}^{m-1}, W = {w_i}_{i=0}^{n-1} and E ⊆ V × W. The lexicographically first
maximal matching (lfm-matching) is the matching produced by successively matching each vertex v_0, ..., v_{m-1}
to the least vertex available in W (see Fig. 6 for an example). We refer to V as the set of bottom nodes and W as
the set of top nodes.
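The definition translates directly into a sequential greedy procedure. The Python sketch below is an illustration under assumed conventions: bottom nodes are numbered 0..m-1, top nodes 0..n-1, edges are (bottom, top) pairs, and the function names are hypothetical. It is the obvious sequential algorithm, not a comparator circuit.

```python
def lfm_matching(m, n, edges):
    """Return {bottom: top} for the lexicographically first maximal matching."""
    edge_set = set(edges)
    top_taken = [False] * n
    matching = {}
    for v in range(m):          # process v_0, ..., v_{m-1} in order
        for w in range(n):      # least available top node adjacent to v
            if (v, w) in edge_set and not top_taken[w]:
                matching[v] = w
                top_taken[w] = True
                break
    return matching


def lfmm(m, n, edges, edge):
    """Edge version: does the designated edge belong to the lfm-matching?"""
    v, w = edge
    return lfm_matching(m, n, edges).get(v) == w


def vlfmm(m, n, edges, top):
    """Vertex version: is the designated top node matched in the lfm-matching?"""
    return top in lfm_matching(m, n, edges).values()


EDGES = [(0, 0), (0, 1), (1, 0), (2, 1), (2, 2)]
print(lfm_matching(3, 3, EDGES))
```

In the example, v_0 takes w_0, v_1 finds its only neighbour already taken and stays unmatched, and v_2 takes w_1, so the top node w_2 remains unmatched.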

In Section 6 we show that two decision problems concerning the lfm-matching of a bipartite graph are
CC-complete under AC^0 many-one reductions. The lfm-matching problem (Lfmm) is to decide if a designated edge
belongs to the lfm-matching of a bipartite graph. The vertex version of lfm-matching problem (vLfmm) is to
decide if a designated top node is matched in the lfm-matching of a bipartite graph G.

The proof showing that vLfmm is CC-complete can be summarized as follows. 

vLfmm is in CC. The definition of the problem outlines an algorithm. It turns out that this algorithm can be 
implemented using comparator circuits. To make the reduction uniform, we use dummy gates (comparator 
gates that compare a wire to itself). 

vLfmm is CC-hard. We use a simple gadget reduction, in which a comparator gate with inputs p_0, q_0 and
outputs p_1, q_1 is represented by four top vertices p_0, q_0, p_1, q_1 and two bottom auxiliary vertices. We keep the
invariant that the value of an input or output is 1 iff its corresponding top vertex is matched.

We then observe that the reduction from vLfmm to Ccv works even if we restrict the degree of the bipartite 
graphs to at most 3. Thus vLfmm with degree at most 3 is already complete for CC. 

To show that Lfmm is CC-complete, it turns out that we need the following intermediate CC-complete
problem. For comparator circuits with negation gates, we allow negation gates to appear on any wire. The comparator
circuit value problem with negation gates (Ccv¬) is: given a comparator circuit with negation gates and input
assignment, and a designated wire, decide if that wire outputs 1. We can extend the above two constructions to
show that Lfmm is AC^0 many-one reducible to Ccv¬ and Ccv is AC^0 many-one reducible to Lfmm, even when
the maximum degree is at most 3. Thus Lfmm with degree at most 3 is CC-complete.

4.3 Algorithms for stable marriage 

Gale and Shapley [8] gave a simple iterative algorithm for solving the stable marriage problem. Subramanian [16, 17]
gave a completely different algorithm which computes both the man-optimal and the woman-optimal stable
marriages as fixed-points of some iteration; see Feder [6, 7] for more on this point of view. Subramanian's algorithm
shows that the stable marriage problem is in CC. It was pointed out in [11] that Subramanian's algorithm
can be conveniently presented using three-valued logic (adding a value * representing unknown), which can be
implemented using "double-rail" logic (see Section 7.2). A simple gadget reduction from Lfmm shows that the
stable marriage problem is CC-complete (see Section 7.1).

In Section 7.3 we demystify Subramanian's algorithm by presenting a sequence of algorithms starting with
the standard Gale-Shapley algorithm, and ending with Subramanian's algorithm. The original Gale-Shapley algorithm
computes only the man-optimal stable marriage. Our first step is symmetrizing the algorithm so that
it computes both the man-optimal and the woman-optimal stable marriages. Both algorithms can be seen as
keeping track of possible partners for each person. The second step is a novel algorithm, the interval algorithm,
in which the possible partners of a person $p$ at each given iteration form an interval within the preference list of
$p$.

The interval algorithm can be implemented as a fixed-point iteration in three-valued logic. The idea is that
the interval splits the preference list into three parts, and this tripartition can be encoded (in a slightly non-obvious
way) using three-valued logic. The fixed-point iteration cannot be implemented using comparator circuits
since some values are used more than once. However, there is a way to simplify the iteration so that it can be
implemented using comparator circuits. The new iteration emulates the old iteration in a time-delayed fashion.

4.4 Proof of NL ⊆ CC

In Appendix ?? we show how to solve the directed reachability problem in CC. Our proof, also appearing in [11],
is much simpler than the original one from [16] [12]. The idea is to drop $n$ pebbles into the source vertex. Each
pebble travels along the lexicographically first maximal path starting from the source vertex, and then the vertex
at the end of this path is pebbled and excluded from the search. After $n$ iterations, all nodes reachable from the
source are pebbled, and we can check whether the target is one of them.
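
The pebbling idea can be simulated sequentially. The sketch below is only an illustration of the procedure, not the comparator-circuit construction from the appendix; the function name is hypothetical, and the rule that a pebble never revisits a vertex already on its current path is an assumption added here so that the walk terminates on graphs with cycles.

```python
def reachable_by_pebbling(n, edges, source, target):
    """Decide reachability by dropping up to n pebbles at `source`.

    Vertices are 0..n-1 and `edges` is a collection of directed pairs.
    Each pebble repeatedly moves to the least out-neighbour that is
    neither pebbled nor already on its own path (the second condition is
    an assumption of this sketch). The vertex where it gets stuck is
    pebbled and thereby excluded from later walks.
    """
    succ = [sorted(w for (v, w) in edges if v == u) for u in range(n)]
    pebbled = [False] * n
    for _ in range(n):
        if pebbled[source]:
            break  # nothing left to explore
        on_path = {source}
        current = source
        while True:
            nxt = next((w for w in succ[current]
                        if not pebbled[w] and w not in on_path), None)
            if nxt is None:
                break
            on_path.add(nxt)
            current = nxt
        pebbled[current] = True
    return pebbled[target]
```

With this rule the pebbles mark vertices in the same order in which a depth-first search from the source would finish them, which is why every reachable vertex is eventually pebbled.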

4.5 Open problems 

Although we have shown that there are problems in relativized NC but not in relativized CC (uniform or nonuniform),
it is quite possible that some of the standard problems in NC$^2$ that are not known to be in NL might be
in (nonuniform) CC. Examples are integer matrix powering and context-free languages (or more generally problems
in LogCFL). Another example is the matching problem for bipartite graphs or general undirected graphs,
which is in RNC$^2$ [10][14] and hence in nonuniform NC$^2$. It would be interesting to show that some (relativized)
version of any of these problems is, or is not, in (relativized) (nonuniform) CC.

5 Oracle separations 

Here we support our conjecture that the complexity classes NC and CC are incomparable by defining and separating
relativized versions of the classes. (See Section 1 for a discussion of the conjecture.) Problems in relativized
CC are computed by comparator circuits which are allowed to have oracle gates, as well as comparator
gates and ¬ gates. (By Section 6.4 the ¬ gates can be eliminated.) Each oracle gate computes some function
$G\colon \{0,1\}^n \to \{0,1\}^n$ for some $n$. We can insert such an oracle gate anywhere in an oracle comparator circuit with
$m$ wires, as long as $m \ge n$, by selecting a level in the circuit, selecting any $n$ wires and using them as inputs to the
gate (so each gate input gets one of the $n$ distinct wires), and then the $n$ outputs feed to some set of $n$ distinct
output wires. Note that this definition preserves the limited fan-out property of comparator circuits: each output
of a gate is connected to at most one input of one other gate.

We are interested in oracles $\alpha\colon \{0,1\}^* \to \{0,1\}^*$ which are length-preserving, so $|\alpha(Y)| = |Y|$. We use the notation
$\alpha_n$ to refer to the restriction of $\alpha$ to $\{0,1\}^n$. We define the relativized complexity class CC($\alpha$) based on the
circuit-family characterization of CC given in TheoremH] Thus a relation $R(X, \alpha)$ is in CC($\alpha$) iff it is computed
by a polynomial size family of comparator circuits which are allowed comparator gates, ¬-gates, and $\alpha_n$ oracle
gates, where $n = |X|$. We consider both a uniform version (in which each circuit family satisfies a uniformity
condition) and a nonuniform version of CC($\alpha$).

Analogous to the above, we define the relativized class NC$^k(\alpha)$ to be the class of relations $R(X, \alpha)$ computed
by some family of depth $O(\log^k n)$ polynomial size Boolean circuits with $\wedge$, $\vee$, $\neg$, and $\alpha_n$-gates (where $n = |X|$)
in which $\wedge$-gates and $\vee$-gates have fan-in at most two. As above, we consider both uniform and non-uniform
versions of these classes. Also NC($\alpha$) $= \bigcup_k$ NC$^k(\alpha)$.

As observed earlier, flipping one input of a comparator gate flips exactly one output. We can generalize this
notion to oracles $\alpha$ as follows.

► Definition 8. A partial function $\alpha\colon \{0,1\}^* \rightharpoonup \{0,1\}^*$ which is length-preserving on its domain is (weakly) 1-Lipschitz
if for all strings $X, X'$ in the domain of $\alpha$, if $|X| = |X'|$ and $X$ and $X'$ have Hamming distance 1, then $\alpha(X)$
and $\alpha(X')$ have Hamming distance at most 1.

A partial function $\alpha\colon \{0,1\}^* \rightharpoonup \{0,1\}^*$ which is length-preserving on its domain is strictly 1-Lipschitz if for all
strings $X, X'$ in the domain of $\alpha$, if $|X| = |X'|$ and $X$ and $X'$ have Hamming distance 1, then $\alpha(X)$ and $\alpha(X')$ have
Hamming distance exactly 1.

Since comparator gates compute strictly 1-Lipschitz functions, it may seem reasonable to restrict comparator
oracle circuits to strictly 1-Lipschitz oracles $\alpha$. Our separation results below hold whether or not we make this
restriction.

Roughly speaking, we wish to prove CC($\alpha$) $\not\subseteq$ NC($\alpha$) and NC$^2(\alpha)$ $\not\subseteq$ CC($\alpha$). More precisely, we have the following
result.

► Theorem 9.

(i) There is a relation $R_1(\alpha)$ which is computed by some uniform polynomial size family of comparator oracle
circuits, but which cannot be computed by any NC($\alpha$) circuit family (uniform or not), even when the oracle $\alpha$
is restricted to be strictly 1-Lipschitz.

(ii) There is a relation $R_2(\alpha)$ which is computed by some uniform NC$^2(\alpha)$ circuit family but which cannot be computed
by any polynomial size family of comparator oracle circuits (uniform or not), even when the oracle $\alpha$ is
restricted to be strictly 1-Lipschitz.

The restriction to strictly 1-Lipschitz oracles might seem severe. However, the following lemma shows how
to extend every 1-Lipschitz function to a strictly 1-Lipschitz function. As a result, it is enough to prove a relaxed
version of Theorem 9 where the oracles are restricted to be only (weakly) 1-Lipschitz.

► Lemma 10. Suppose $f$ is a 1-Lipschitz function. Define $g(X)$ by adding a 'parity bit' in front of $f(X)$ as follows:

$$g(X) = [\mathrm{parity}(X) \oplus \mathrm{parity}(f(X))]\, f(X).$$
Then $g$ is strictly 1-Lipschitz.

Proof. Suppose $d(X, Y) = 1$. Clearly $d(g(X), g(Y)) \le 2$. On the other hand, $\mathrm{parity}(g(X)) = \mathrm{parity}(X)$, and hence
$d(g(X), g(Y))$ is odd. We conclude that $d(g(X), g(Y)) = 1$. ◀
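
As a sanity check, the parity-bit construction can be verified exhaustively on small inputs. The Python sketch below does this for one hand-picked weakly 1-Lipschitz function (clearing the last bit); that example function and all names in the code are assumptions chosen for illustration only.

```python
from itertools import product


def parity(bits):
    return sum(bits) % 2


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def f(x):
    """Example weakly 1-Lipschitz function (an assumption of this sketch):
    copy the input but clear its last bit."""
    return x[:-1] + (0,)


def g(x):
    """Lemma 10: prepend the bit parity(x) XOR parity(f(x)) to f(x)."""
    fx = f(x)
    return (parity(x) ^ parity(fx),) + fx


def neighbour_distances(func, n):
    """Set of output distances over all input pairs at Hamming distance 1."""
    distances = set()
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = x[:i] + (1 - x[i],) + x[i + 1:]
            distances.add(hamming(func(x), func(y)))
    return distances
```

Here `f` is not strictly 1-Lipschitz, since flipping the last input bit changes nothing, while every neighbouring pair of inputs is mapped by `g` to outputs at distance exactly 1.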

Given a relation $S(\beta)$ which separates two relativized complexity classes even when the oracle $\beta$ is restricted
to 1-Lipschitz functions, define a new relation $R(\alpha) = S(\mathrm{chop}(\alpha))$, where $\mathrm{chop}(\alpha)_n(X)$ results from $\alpha_{n+1}(0X)$ by
chopping off the leading bit. Given an oracle $\beta$ which is 1-Lipschitz, define a new oracle $\alpha$ by

$$\alpha_{n+1}(bX) = [\mathrm{parity}(bX) \oplus \mathrm{parity}(\beta_n(X))]\, \beta_n(X).$$

Lemma 10 shows that $\alpha$ is strictly 1-Lipschitz. Notice that $R(\alpha) = S(\beta)$. Hence $R(\alpha)$ separates the two relativized
complexity classes even when the oracle $\alpha$ is restricted to strictly 1-Lipschitz functions.

Henceforth we will prove the relaxed version of Theorem 9 in which the oracles are only restricted to be
(weakly) 1-Lipschitz.

5.1 Proof of item (i) of Theorem 9

It turns out that item (i) is easy to prove if we require the NC($\alpha$) circuit family to work on all length-preserving
oracles $\alpha$, and not just 1-Lipschitz oracles. This is a consequence of the next proposition, which follows from the
proof of [1, Theorem 14], and states that the $\ell$th iteration of an oracle requires a circuit with oracle nesting depth
$\ell$ to compute.

► Definition 11. The nesting depth of an oracle gate G in an oracle circuit is the maximum number of oracle 
gates (counting G) on any path in the circuit from an input (to the circuit) to G. 

► Proposition 12. Let $d, n > 0$ and let $C(\alpha)$ be a circuit with any number of Boolean gates but with fewer than $2^n$
$\alpha_n$-gates such that the nesting depth of any $\alpha_n$-gate is at most $d$. If the circuit correctly computes the first bit of $\alpha_n^\ell$
(the $\ell$th iteration of $\alpha_n$), and this is true for all oracles $\alpha_n$, then $\ell \le d$.

We apply Proposition 12 with $\ell = n$: any circuit computing the first bit of $\alpha_n^n$ must have oracle nesting depth at least $n$,
so this bit cannot be computed in NC($\alpha$). But $\alpha_n^n$ obviously can be computed in CC($\alpha$) by placing $n$ oracle gates $\alpha_n$ in series.
This proves item (i) without the 1-Lipschitz restriction.

For the proof of Proposition 12 we use the following definition and lemma from [1].

► Definition 13. A partial function $f\colon \{0,1\}^n \rightharpoonup \{0,1\}^n$ is called $\ell$-sequential if (abbreviating $0^n$ by $0$)

$$0, f(0), f^2(0), \dots, f^\ell(0)$$
are all defined, but $f^\ell(0) \notin \mathrm{dom}(f)$.

Note that in Definition 13 it is necessarily the case that $0, f(0), f^2(0), \dots, f^\ell(0)$ are distinct.

► Lemma 14. Let $n \in \mathbb{N}$ and $f\colon \{0,1\}^n \rightharpoonup \{0,1\}^n$ be an $\ell$-sequential partial function. Let $M \subset \{0,1\}^n$ be such that
$|\mathrm{dom}(f) \cup M| < 2^n$. Then there is an $(\ell+1)$-sequential extension $f' \supseteq f$ with $\mathrm{dom}(f') = \mathrm{dom}(f) \cup M$.

Proof. Let $Y \in \{0,1\}^n \setminus (M \cup \mathrm{dom}(f))$. Such a $Y$ exists by our assumption on the cardinality of $M \cup \mathrm{dom}(f)$. Let
$f'$ be $f$ extended by setting $f'(X) = Y$ for all $X \in M \setminus \mathrm{dom}(f)$. This $f'$ is as desired.

Indeed, assume that $0, f'(0), \dots, f'^{\ell+1}(0), f'^{\ell+2}(0)$ are all defined. Then, since $Y \notin \mathrm{dom}(f')$, it follows that all
the $0, f'(0), \dots, f'^{\ell+1}(0)$ have to be different from $Y$. Hence these values have already been defined in $f$. But this
contradicts the assumption that $f$ was $\ell$-sequential. ◀

Proof of Proposition 12. We use $f$ to stand for the oracle function $\alpha_n$. Assume that such a circuit computes
$f^\ell(0)$ correctly for all oracles. We have to find a setting for the oracle that witnesses $\ell \le d$.

By induction on $k \ge 0$ we define partial functions $f_k\colon \{0,1\}^n \rightharpoonup \{0,1\}^n$ with the following properties.

■ $f_0 \subseteq f_1 \subseteq f_2 \subseteq \cdots$

■ The size $|\mathrm{dom}(f_k)|$ of the domain of $f_k$ is at most the number of oracle gates of nesting depth $k$ or less.
■ $f_k$ determines the values of all oracle gates of nesting depth $k$ or less.
■ $f_k$ is $k$-sequential.

We can take $f_0$ to be the totally undefined function, since $f^0(0) = 0$ by definition, so $f_0$ is 0-sequential. For the
induction step let $M$ be the set of all strings $Y$ of length $n$ such that $Y$ is queried by an oracle gate at level $k$. Let
$f_{k+1}$ be a $(k+1)$-sequential extension of $f_k$ to domain $\mathrm{dom}(f_k) \cup M$ according to Lemma 14.

For $k = d$ we get the desired bound. As $f_d$ already determines the values of all gates, the output of the circuit
is already determined, but $f^{d+1}(0)$ is still undefined and we can define it in such a way that it differs from the
first bit of the output of the circuit. ◀

Now we are ready to prove item (i) for the case when oracles are restricted to be 1-Lipschitz.

Notation. We use $T$ to stand for both a bit string and the number it represents in binary. The $i$th bit of $T$ is
$\mathrm{bit}(T, i)$; the least significant bit (lsb) is bit 1. For a bit $b$, $b^{(m\text{ times})}$ is the bit $b$ repeated $m$ times. The Hamming
weight of a string $X$ is $\|X\|$. The Hamming distance between $X$ and $Y$ is $d(X, Y) = \|X \oplus Y\|$. The length of $X$ is $|X|$.

► Definition 15. Let $n = 2^\ell$, and define $m = 2n+1$. Given $f\colon \{0,1\}^{m\ell+n} \to \{0,1\}^n$, define the slice functions

$$f_T(X) = f\big(\mathrm{bit}(T,1)^{(m\text{ times})} \cdots \mathrm{bit}(T,\ell)^{(m\text{ times})} X\big), \quad \text{where } |T| = \ell \text{ and } |X| = n.$$
Define the iterations

$$X_0 = 0^{(n\text{ times})}, \qquad X_{T+1} = f_T(X_T).$$

Finally, define

$$F = \mathrm{bit}(X_{\lfloor\sqrt{n}\rfloor}, 1).$$

► Lemma 16. The function $F = F(f)$ can be computed using a uniform family of comparator circuits of size
polynomial in $n$ which use $f$ as an oracle.¹

Proof. Obvious. ◀
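
To make Definition 15 concrete, the following Python sketch evaluates $F$ by direct iteration of the slice functions. It is a sequential illustration, not a comparator circuit. The function names are hypothetical, and the code assumes that the least significant bit of a string is its last position, which matches the later use of the string $0^{(n-1\text{ times})}1$ to force $F = 1$.

```python
from math import isqrt


def slice_input(T, ell, m, X):
    """Input to f that selects slice f_T: each bit of T (bit 1 = lsb first)
    is repeated m times, followed by the n-bit string X."""
    prefix = []
    for i in range(1, ell + 1):
        prefix.extend([(T >> (i - 1)) & 1] * m)
    return tuple(prefix) + tuple(X)


def compute_F(f, ell):
    """Evaluate F = bit(X_{floor(sqrt(n))}, 1) with X_0 = 0^n and
    X_{T+1} = f_T(X_T), where n = 2**ell and m = 2n + 1.

    `f` maps a tuple of m*ell + n bits to a tuple of n bits.
    Assumption of this sketch: bit 1 (the lsb) is the last position.
    """
    n = 2 ** ell
    m = 2 * n + 1
    X = (0,) * n
    for T in range(isqrt(n)):
        X = f(slice_input(T, ell, m, X))
    return X[-1]
```

The loop applies the slices $f_0, \dots, f_{\lfloor\sqrt{n}\rfloor - 1}$ in order, which is exactly the chain of oracle calls that a comparator circuit can place in series.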

If $f$ ignores its first $m\ell$ input bits then $F$ is the first bit of the $\sqrt{n}$-th iteration of $f$, and hence by Proposition 12
any subexponential size circuit computing $F$ requires depth $\sqrt{n}$. In the rest of this section we will show
that even if $f$ is assumed to be 1-Lipschitz, $F$ cannot be computed by any circuit with only polynomially many
oracle gates which are nested only polylogarithmically deep.

The first step is to reduce the problem of constructing a 1-Lipschitz $f$ to the problem of constructing 1-Lipschitz
$f_0, \dots, f_{n-1}$.

► Definition 17. Let $f$ be a function as in Definition 15. Let $R_1 \dots R_\ell X$ be an input to $f$, where $|R_i| = m$, $|X| = n$.
Suppose $R_i$ contains $z_i$ zeroes and $o_i$ ones. Define $t_i = 0$ if $z_i > o_i$ and $t_i = 1$ if $z_i < o_i$ (one of these must happen
since $m$ is odd). Let $x_i = \min(z_i, o_i)$ and $x = \max_i x_i$. The values $t_1, \dots, t_\ell$ define a string $T$. We say that $R_1 \dots R_\ell X$
belongs to the blob $B(T, X)$, and is at distance $x$ from the center string $t_1^{(m\text{ times})} \cdots t_\ell^{(m\text{ times})} X$. Thus the blobs
form a partition of the domain $\{0,1\}^{m\ell+n}$ of $f$.

We say that $f$ is blob-like if for all $R_1, \dots, R_\ell, X$, with $T$ as defined above,

$$f(R_1 \dots R_\ell X) = f_T(X) \wedge \big(0^{(x\text{ times})} 1^{(n-x\text{ times})}\big). \tag{5.1}$$

(Here we use bitwise $\wedge$.) In words, the value of $f$ at a point $R$ which is at distance $x$ from the center of some blob
$B$ is equal to the value of $f$ at the center of the blob, with the first $x$ bits set to zero.

We say that $f$ is a blob-like partial function if it is a partial function whose domain is a union of blobs, and
inside each blob it satisfies (5.1).

Note that the values at centers of blobs are unconstrained by (5.1) because then $x = 0$.

► Lemma 18. If $f$ is blob-like and $f_T$ is 1-Lipschitz for all $0 \le T < n$ then $f$ is 1-Lipschitz.

Proof. Let $R_1 \dots R_\ell X$ be an input to $f$. We argue that if we change a bit in the input, then at most one bit changes
in the output. If we change a bit of $X$, then this follows from the 1-Lipschitz property of the corresponding $f_T$. If
we change a bit of $R_i$ without changing $T$, then we change $x$ by at most 1, and so at most one bit of the output
is affected. Finally, if we change a bit of $R_i$ and this does change $T$, then we must have had (without loss of
generality) $z_i = n$, $o_i = n+1$, and we changed a 1 to 0 to make $z_i = n+1$, $o_i = n$. In both inputs, $x = n$, and so the
output is $0$ in both cases. ◀
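
Lemma 18 can be checked by brute force in the smallest case $\ell = 1$, $n = 2$, $m = 5$, where $f$ has 7 input bits. The sketch below builds a blob-like function from two slice functions via (5.1) and tests the 1-Lipschitz property exhaustively; the helper names and the particular slices used in the example are assumptions made for illustration.

```python
from itertools import product

ELL, N = 1, 2          # smallest case: n = 2**ELL
M = 2 * N + 1          # block length m = 2n + 1


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def blob_like(slices):
    """Build f on {0,1}^(M*ELL + N) from slice functions using (5.1)."""
    def f(inp):
        t_bits, x = [], 0
        for i in range(ELL):
            block = inp[i * M:(i + 1) * M]
            ones = sum(block)
            zeros = M - ones
            t_bits.append(0 if zeros > ones else 1)   # majority bit t_i
            x = max(x, min(zeros, ones))              # distance to center
        T = sum(b << i for i, b in enumerate(t_bits))  # bit 1 is the lsb
        value = slices[T](inp[ELL * M:])
        mask = (0,) * x + (1,) * (N - x)               # first x bits zeroed
        return tuple(v & k for v, k in zip(value, mask))
    return f


def is_1_lipschitz(f, length):
    """Exhaustively check that flipping one input bit flips <= 1 output bit."""
    for inp in product((0, 1), repeat=length):
        for i in range(length):
            other = inp[:i] + (1 - inp[i],) + inp[i + 1:]
            if hamming(f(inp), f(other)) > 1:
                return False
    return True
```

For example, taking the identity and the bitwise complement as the two slices gives a 1-Lipschitz $f$, whereas a slice that is not itself 1-Lipschitz breaks the property already at the center of its blob.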

The second step is to find a way to construct 1-Lipschitz functions from $\{0,1\}^n$ to itself, given a small number
of constraints.

► Lemma 19. Suppose $g\colon \{0,1\}^n \rightharpoonup \{0,1\}^n$ is a partial function, and $g(P) = 0$ for all $P \in \mathrm{dom}(g)$. Let $X$ be a point
of Hamming distance at least $d$ from any point in $\mathrm{dom}(g)$. Then for every $Y$ of Hamming weight at most $d$, we can
extend $g$ to a 1-Lipschitz total function satisfying $g(X) = Y$.

Proof. Given $X, Y$, define $h(Z)$ to be $Y$ with the first $\min(d(Z, X), \|Y\|)$ ones changed to zeros. We have $h(X) = Y$
since $d(X, X) = 0$. For $P \in \mathrm{dom}(g)$, $d(P, X) \ge d$ implies $h(P) = 0$, using $\|Y\| \le d$. Therefore $h$ extends $g$. On the
other hand, $h$ is 1-Lipschitz since changing a bit of the input $Z$ can change $d(Z, X)$ by at most 1, and so at most
one bit of the output is affected. ◀
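
The function $h$ from this proof is easy to write down explicitly. The following Python sketch implements it and checks the 1-Lipschitz property exhaustively for small $n$; the names `extend` and `is_1_lipschitz` are hypothetical helpers introduced for this illustration.

```python
from itertools import product


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def extend(X, Y):
    """Return h with h(Z) = Y after changing its first
    min(d(Z, X), ||Y||) ones to zeros, as in the proof of Lemma 19."""
    def h(Z):
        k = min(hamming(Z, X), sum(Y))
        out = list(Y)
        for i, bit in enumerate(out):
            if k == 0:
                break
            if bit == 1:
                out[i] = 0
                k -= 1
        return tuple(out)
    return h


def is_1_lipschitz(f, n):
    """Exhaustively check that flipping one input bit flips <= 1 output bit."""
    for Z in product((0, 1), repeat=n):
        for i in range(n):
            other = Z[:i] + (1 - Z[i],) + Z[i + 1:]
            if hamming(f(Z), f(other)) > 1:
                return False
    return True
```

With `X = (1, 1, 1, 1)` and `Y = (1, 0, 1, 0)` (so $d = 2$), every point at distance at least 2 from `X` is mapped to the zero string, while `X` itself is mapped to `Y`.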

Finally, we need a technical lemma about the volume of Hamming balls.

¹ We can pad the output of $f$ with $m\ell$ zeros so that $f$ has the same number of outputs as inputs.

► Definition 20. Let $n, d$ be given. Then $V(n, d)$ is the number of points in $\{0,1\}^n$ of Hamming weight at most
$d$, that is

$$V(n, d) = \sum_{k \le d} \binom{n}{k}.$$

► Lemma 21. For $d \ge 0$, $V(n, d+1)/V(n, d) \le n+1$.

Proof. Each point in $V(n, d+1)$ is either already a point of $V(n, d)$, or it can be obtained by taking a point of
$V(n, d)$ and changing one bit from 0 to 1. Conversely, for each point of $V(n, d)$, a bit can be changed from 0 to 1
in at most $n$ different ways. ◀

► Corollary 22. If $V(n, d) \ge r \ge 1$ then there exists $d' \ge 0$ such that

$$r \le \frac{V(n, d)}{V(n, d')} < (n+1)\, r.$$

Proof. Let $d'$ be the maximum number satisfying $r \le V(n, d)/V(n, d')$. Since $r \le V(n, d) = V(n, d)/V(n, 0)$, such
a number exists. On the other hand,

$$\frac{V(n, d)}{V(n, d')} \le (n+1)\, \frac{V(n, d)}{V(n, d'+1)} < (n+1)\, r. \qquad ◀$$
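
The quantities in Lemma 21 and Corollary 22 are easy to compute for small parameters. The Python sketch below uses exact integer arithmetic; the function names are hypothetical, and the search for $d'$ is capped at $n$ (an assumption of this sketch, harmless because $V(n, d')$ stops growing beyond $d' = n$).

```python
from math import comb


def V(n, d):
    """Number of points of {0,1}^n with Hamming weight at most d."""
    return sum(comb(n, k) for k in range(min(d, n) + 1))


def corollary_22(n, d, r):
    """Return the largest d' <= n with r <= V(n, d) / V(n, d').

    Requires V(n, d) >= r >= 1, as in Corollary 22. Comparisons are done
    on integers, so r should be an integer here.
    """
    assert V(n, d) >= r >= 1
    d_prime = 0
    while d_prime < n and r * V(n, d_prime + 1) <= V(n, d):
        d_prime += 1
    return d_prime
```

For $n = 10$, $d = 10$ and $r = 5$ this returns $d' = 3$, and indeed $V(10,10)/V(10,3) = 1024/176$ lies between $5$ and $55$.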

We are now ready to prove the main lower bound, which implies item (i) of Theorem 9.

► Theorem 23. Let $a > 0$ be given. For large enough $n$, every circuit $C(f)$ with at most $n^a$ oracle gates, nested less
than $\sqrt{n}$ deep, fails to compute $F$ for some 1-Lipschitz function $f$.

Proof. Put $T_{\max} = \lfloor\sqrt{n}\rfloor - 1$. Let $d_0, \dots, d_{T_{\max}}$ be a sequence of positive integers satisfying

$$\frac{V(n, d_T)}{V(n, d_{T+1} - 1)} > n^a, \qquad 0 \le T < T_{\max}. \tag{5.2}$$

Such numbers exist whenever $2^n \ge (n+1)^{T_{\max}(a+1)}$, which holds when $n$ is large enough. Indeed, we will construct
such a sequence inductively using Corollary 22, keeping the invariant

$$V(n, d_T) \ge \frac{2^n}{(n+1)^{T(a+1)}}.$$

For the base case, $d_0 = n$ certainly satisfies the invariant. Given $d_T$, use Corollary 22 with $d = d_T$ and $r = (n+1)^a$.
Since $V(n, d_T) \ge 2^n/(n+1)^{T(a+1)} \ge (n+1)^{a+1}$ for large $n$, the corollary supplies us with $d'$ satisfying

$$(n+1)^a \le \frac{V(n, d_T)}{V(n, d')} < (n+1)^{a+1}.$$

Let $d_{T+1} = d' + 1$. This certainly satisfies condition (5.2), and the invariant is satisfied since

$$V(n, d_{T+1}) \ge V(n, d') > \frac{V(n, d_T)}{(n+1)^{a+1}} \ge \frac{2^n}{(n+1)^{(T+1)(a+1)}}.$$

We will define the function $f$ in $T_{\max}$ stages, similar to the proof of Proposition 12, except we use the notation
$f^{(k)}$ instead of $f_k$. At every stage the function $f^{(k)}$ will be a blob-like partial function which defines the output
of every oracle gate in $C(f)$ of nesting depth $k$ or less. The starting point is $f^{(0)}$, which is the empty function. At
stage $k$ we will define the partial function $f^{(k+1)}$ which extends $f^{(k)}$, keeping the following invariants:

■ $f^{(k)}$ is a blob-like partial function.

■ For $T < k$, $f_T^{(k)}$ is a total 1-Lipschitz function.
■ For $T \ge k$, at any point $P$ at which $f_T^{(k)}$ is defined it is equal to $0$, and some gate in $C(f)$ of nesting depth $k$ or
less has its input in the blob $B(T, P)$. Hence $|\mathrm{dom}(f_T^{(k)})| \le n^a$.
■ $X_k$ is defined by $f^{(k)}$.
■ $f_k^{(k)}(X_k)$ is undefined.
■ (*) $d(P, X_k) \ge d_k$ for any $P \in \mathrm{dom}(f_k^{(k)})$.

■ $f^{(k)}$ determines the output of every oracle gate of nesting depth $k$ or less.

It is easy to verify that the empty function $f^{(0)}$ satisfies the invariants. The function $f^{(T_{\max})}$ determines the output
of the circuit $C(f)$. However, $X_{T_{\max}+1} = f_{T_{\max}}^{(T_{\max})}(X_{T_{\max}})$ is undefined. We can extend $f^{(T_{\max})}$ to a 1-Lipschitz function
in two different ways: Put $f_T = 0$ for $T > T_{\max}$, and let $f_{T_{\max}}$ be either (1) the constant zero function, or (2) the function
which is zero everywhere except for $f_{T_{\max}}(X_{T_{\max}}) = 0^{(n-1\text{ times})}1$. Since $F$ is different in these two extensions,
the circuit fails to compute $F$ correctly in one of them.

It remains to show how to define $f^{(k+1)}$ given $f^{(k)}$. Let $\mathcal{G}$ be the set of oracle gates of nesting depth exactly
$k+1$. For any $G \in \mathcal{G}$ whose input belongs to a blob $B(T, X)$ for $T > k$, if $f_T^{(k)}(X)$ is undefined, then define $f^{(k+1)}$ so
that it extends $f^{(k)}$ and is $0$ on the entire blob $B(T, X)$ (this is a blob-like assignment). Let $A = \mathrm{dom}(f_{k+1}^{(k+1)})$; note
that $|A| \le n^a$. Condition (5.2) implies

$$V(n, d_{k+1} - 1)\,|A| < V(n, d_k),$$

and so there is a point $Y$ of Hamming weight at most $d_k$ which is of distance at least $d_{k+1}$ from each point in
$A$. Define $f_k^{(k+1)}(X_k) = Y$ (so $X_{k+1} = Y$), and extend $f_k^{(k+1)}$ to a total 1-Lipschitz function using Lemma 19 with
$d = d_k$ (use invariant (*)). Then extend $f^{(k+1)}$ to a blob-like partial function using (5.1). It is routine to verify that
the invariants hold for $f^{(k+1)}$. ◀

► Remark. A more natural target function is $F' = \mathrm{bit}(X_n, 1)$. We can easily modify the proof of Theorem 23 to
handle this function. We set the first $n - \lfloor\sqrt{n}\rfloor$ functions $f^{(0)}, \dots, f^{(n - \lfloor\sqrt{n}\rfloor - 1)}$ to be constant, and then run the
proof from that point on.

An even more natural target function has an unstructured $f$ as input, and $F'' = \mathrm{bit}(f^{(N)}(0), 1)$. We leave open
the question of whether the method can be adapted to work in this case.

We have previously shown how to construct $F'''$ that separates CC and NC even under the restriction that
the oracle be strictly 1-Lipschitz. The function $F'''$ basically ignores one of the outputs of the oracle while
iterating it. It is possible to slightly modify the proof of Theorem 23 so that it directly applies to $F$ even under
the restriction that the oracle is strictly 1-Lipschitz. We leave this modification as an exercise to the reader.

5.2 Proof of item (ii) of Theorem 9

Here we exploit the 1-Lipschitz property of comparator gates and ¬-gates by using oracles which are weakly
1-Lipschitz, so that all gates in the relativized circuits have this property.

Let $n, m, d \in \mathbb{N}$, with $d > 3$. For each $k \in [m]$ and $i \in [n]$, let $a_i^k\colon \{0,1\}^{dn} \to \{0,1\}$ be a Boolean oracle with $dn$
input bits. Let $A^k = (a_1^k, \dots, a_n^k)$. We define a function $y = f[A^1, \dots, A^m]$ as follows:

$$x_i^k = a_i^k\big(\underbrace{x_1^{k+1}, \dots, x_1^{k+1}}_{d\text{ times}}, \dots, \underbrace{x_n^{k+1}, \dots, x_n^{k+1}}_{d\text{ times}}\big), \qquad k \in [m],\ i \in [n], \tag{5.3}$$

$$x_i^{m+1} = 0, \qquad i \in [n], \tag{5.4}$$

$$y = x_1^1 \oplus \cdots \oplus x_n^1. \tag{5.5}$$

As stated the oracle $a_i^k$ has $dn$ inputs and just one output, but we can make it fit our convention that an oracle
gate has the same number of outputs as inputs simply by assuming that the gate has an additional $dn - 1$ outputs
which are identically zero.

Note that the function computed by such an oracle is necessarily (weakly) 1-Lipschitz.

Let $X^k = (x_1^k, \dots, x_n^k)$ and $A^k = (a_1^k, \dots, a_n^k)$. Note that an oracle circuit of depth $m + O(\log n)$ with $mn$ gates can
compute $y$ simply by successively computing $X^m, X^{m-1}, \dots, X^1$ and computing the parity of $X^1$, provided that the
circuit is allowed to have gates with fan-out $d$. However, the fan-out restriction for comparator circuits allows us
to prove the following.

► Theorem 24. If $n \ge 3$, then every oracle comparator circuit computing $y = f[A^1, \dots, A^m]$ has at least

$$\min\big(2^n, (d-2)^{m-1}\big)$$
gates.

By setting $m = \log^2 n$ and $d = 4$ this almost proves item (ii) of Theorem 9, except we need to argue that the
array of oracles $a_i^k$ can be replaced by a single oracle. Later we will show how a simple adaptation of the proof of
Theorem 24 accomplishes this.

Proof of Theorem 24. Fix an oracle comparator circuit $C$ which computes $y = f[A^1, \dots, A^m]$.

► Definition 25. We say that an input $(z_1, \dots, z_{dn})$ to some oracle $a_i^k$ in $C$ is regular if it has the form of the inputs
in (5.3); that is, if $z_{(a-1)d+b} = z_{(a-1)d+c}$ for all $a \in [n]$ and $b, c \in [d]$. We say that an oracle $a_i^k$ is regular if $a_i^k(Z) = 0$
for all irregular inputs $Z$.

Note that any irregular oracle $a_i^k$ can be replaced by an equivalent regular oracle which does not affect (5.3).

► Definition 26. Let $g$ be the total number of any of the gates $a_i^k$ in the circuit $C$. For a given assignment to
the oracles, a particular gate $a_i^k$ is active in $C$ if its input is as specified by (5.3), (5.4). Let $g_t$ be the expected total
number of active gates $a_1^t, \dots, a_n^t$ in $C$ under a uniformly random regular setting of all oracles.

It is easy to see that

$$g_1 \ge n, \tag{5.6}$$

since we need at least one gate $a_i^1$ for each $i \in [n]$.
Let $k \in [m]$ be greater than 1. We will show that

$$g_{k-1} \le \frac{g}{2^n} + \frac{g_k}{d-2}. \tag{5.7}$$

We use the following consequence of the (weakly) 1-Lipschitz property of all gates in the circuit: If we change
the definition of some copy of some gate $a_i^k$ at its input in the circuit $C$, this generates a unique flip-path which
may end at some copy of some other gate, in which case we say that the latter gate consumes the flip-path. (The
flip-path is a path in the circuit such that the Boolean value of each edge in the path is negated.)

Let $G_1, \dots, G_{2^n}$ be a Gray code listing all strings in $\{0,1\}^n$, where $G_1 = 0^n$. Thus the Hamming distance between
any two successive strings $G_t$ and $G_{t+1}$, and between $G_{2^n}$ and $G_1$, is one. Take a random regular setting of all the
oracles, and let $Z_1$ be the value of $X^k$ under this setting. Shift the above Gray code to form a new one $Z_1, \dots, Z_{2^n}$
by setting $Z_t = G_t \oplus Z_1$. Then for each $t \in [2^n]$, $Z_t$ is uniformly distributed and independent of $X^\ell$ for $\ell \ne k$. Thus if
we change the output of $A^k$ at its active input to $Z_t$, the result is again a uniformly random regular oracle setting.
Let $\gamma_t$ be the number of active $A^k$ gates (i.e. any active gate of the form $a_i^k$ for some $i$) after this change, and let
$\delta_t$ be the number of active $A^{k-1}$ gates after the change. Taking expectations we have for each $t \in [2^n]$

$$\mathbb{E}(\gamma_t) = g_k, \qquad \mathbb{E}(\delta_t) = g_{k-1}. \tag{5.8}$$
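
The shifted Gray code used here can be produced concretely. The sketch below uses the standard binary reflected Gray code, which is one possible choice (the argument only needs some cyclic Gray code starting at $0^n$); strings are represented as integers and the function names are hypothetical.

```python
def gray_code(n):
    """Binary reflected Gray code on n bits, as integers; starts at 0."""
    return [i ^ (i >> 1) for i in range(2 ** n)]


def shifted_gray_code(n, z1):
    """Sequence Z_t = G_t XOR Z_1; it starts at z1 and is again a cyclic
    Gray code, because XOR with a fixed string preserves Hamming distances."""
    return [g ^ z1 for g in gray_code(n)]


def is_cyclic_gray_code(seq, n):
    """Check that seq lists all n-bit strings and that cyclically
    consecutive entries differ in exactly one bit."""
    if sorted(seq) != list(range(2 ** n)):
        return False
    return all(bin(seq[t] ^ seq[(t + 1) % len(seq)]).count("1") == 1
               for t in range(len(seq)))
```

Because every step changes exactly one bit of the output of $A^k$, each step of the sequence generates flip-paths one at a time, which is what the counting argument relies on.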

We will change the output of $A^k$ (at its active input) successively from $Z_1$ to $Z_{2^n}$, and consider the relationship
between $\gamma_t$ and $\delta_t$. The total number of flip-paths generated during the process is

$$\sum_{t=1}^{2^n - 1} \gamma_t.$$

Each time an $A^{k-1}$ gate is rendered active for the first time, we will call the gate fresh. Otherwise, it is reused. At
time $t$, let $\delta'_t$ ($\delta''_t$) be the number of fresh (reused) $A^{k-1}$ gates. Thus $\delta''_1 = 0$, and

$$\delta_t = \delta'_t + \delta''_t. \tag{5.9}$$

Since a given gate can be fresh at most once, we have

$$\sum_{t=1}^{2^n} \delta'_t \le g. \tag{5.10}$$

Each time an $a_i^{k-1}$ gate is reused it has consumed at least $d - 2$ flip-paths since the last time it was active. This is
because at least $d$ consecutive inputs must be changed from all 0's to all 1's (or vice versa), and since the gate is
regular, its output will be constantly 0 during at least $d - 2$ consecutive changes.
Since there must be at least as many flip-paths generated as consumed, we have

$$(d-2) \sum_{t=1}^{2^n} \delta''_t \le \sum_{t=1}^{2^n} \gamma_t. \tag{5.11}$$

From (5.9), (5.10), (5.11) we have

$$\sum_{t=1}^{2^n} \delta_t \le g + \frac{1}{d-2} \sum_{t=1}^{2^n} \gamma_t. \tag{5.12}$$

Now (5.7) follows from (5.12) and (5.8) by linearity of expectations.
Hence either $g \ge 2^n$ or

$$g_t \ge (d-2)\,[g_{t-1} - 1].$$

From this and (5.6) we have a recurrence whose solution shows

$$g_t \ge (d-2)^{t-1}\, n - \frac{(d-2)^t - (d-2)}{d-3}.$$

If $n \ge 3$ then $g_m \ge (d-2)^{m-1}$, and Theorem 24 follows. ◀
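
The closed form can be checked against the recurrence numerically. The sketch below iterates the extremal case $g_1 = n$, $g_t = (d-2)(g_{t-1} - 1)$ with exact rational arithmetic and compares it with the displayed solution; the function names are hypothetical and the code assumes $d > 3$ so that the denominator $d - 3$ is nonzero.

```python
from fractions import Fraction


def iterate_recurrence(n, d, m):
    """Values g_1, ..., g_m of the extremal recurrence
    g_1 = n, g_t = (d - 2) * (g_{t-1} - 1)."""
    g = Fraction(n)
    values = [g]
    for _ in range(2, m + 1):
        g = (d - 2) * (g - 1)
        values.append(g)
    return values


def closed_form(n, d, t):
    """(d-2)^(t-1) * n - ((d-2)^t - (d-2)) / (d-3), assuming d > 3."""
    return (d - 2) ** (t - 1) * n - Fraction((d - 2) ** t - (d - 2), d - 3)
```

For $d = 4$ and $n = 3$ the values are $g_t = 2^{t-1} + 2$, which is at least $(d-2)^{t-1} = 2^{t-1}$ as required.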

Now we change the setting in Theorem 24 so that it applies to a single oracle. The new oracle $a(k, i, x)$ is used
in the same way as $a_i^k(x)$. The first two arguments can be encoded in binary or unary, and we don't care what
happens when they are not "legal" (we don't require the output to be 0 unless the $x$ argument is illegal). Define
active $a_i^k$ gates as gates whose inputs are $(k, i, x)$, where $x$ is the relevant active input. We argue as before, and
again conclude that if $n \ge 3$ then $g_m \ge (d-2)^{m-1}$. Hence Theorem 24 follows in the single oracle setting.



5.3 SC vs CC 

Uniform SC$^k$ is the class of problems decidable by Turing machines running in polynomial time using $O(\log^k n)$
space. Non-uniform SC$^k$ is the class of problems solvable by circuits of polynomial size and $O(\log^k n)$ width. Just
as NC and CC appear to be incomparable, it seems plausible that SC and CC are incomparable. For one direction,
NL is a subclass of both CC and NC, but is conjectured not to be a subclass of SC (Savitch's algorithm takes $2^{O(\log^2 n)}$
time). For the other direction we can give a convincing oracle separation as follows.

We apply Theorem 24 to a problem with a padded input of length $N$, and set the 'real' input $n = \log^2 N$, and
also $m = \log^2 N$ and $d = 4$. The theorem implies that every comparator circuit solving $F$ has size at least

$$2^{\min(m-1,\, n)} = 2^{\log^2 N - 1},$$

which is superpolynomial. Thus this padded problem is not in relativized CC.

[Diagram of reductions between the problems; recoverable node labels: vLfmm, 3vLfmm, 3Lfmm, Lfmm.]

Figure 7 The label of an arrow denotes the section in which the reduction is described. Arrows without labels denote
trivial reductions. All six problems are CC-complete.

However, a Turing machine $M$ equipped with an oracle tape which can query $4 \log^2 N$ bits of each oracle $a_i^k$
can compute this padded version of $F$ in linear time and $O(n) = O(\log^2 N)$ space, so this problem is in relativized
SC$^2$. The machine $M$ proceeds by successively computing $X^m, X^{m-1}, \dots, X^1$, writing each of these $X^k$ on its work
tape and then erasing the previous one. The machine computes $X^k$ from $X^{k+1}$ bit by bit, making a query of size
$4 \log^2 N$ to its query tape for each bit. (We assume that $M$ can access its oracle in such a way that it can determine
$N$, and hence $m$ and $n$.)
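
The layer-by-layer procedure of $M$ can be sketched in Python as follows. This is only an illustration of the evaluation order, not a Turing machine; the oracle interface `oracle(k, i, z)` is a hypothetical stand-in for the oracles $a_i^k$, and the final output is taken to be the parity of $X^1$, as in the description of the function $y$ in Section 5.2.

```python
def compute_y(oracle, n, m, d):
    """Evaluate y layer by layer, keeping only one layer X^k in memory.

    `oracle(k, i, z)` is a hypothetical interface for a_i^k: it receives a
    tuple z of d*n bits and returns one bit. Layers are indexed k = m..1
    and positions i = 1..n.
    """
    X = (0,) * n                       # X^{m+1} is the all-zero string
    for k in range(m, 0, -1):
        # Regular query: each bit of the previous layer repeated d times.
        query = tuple(b for b in X for _ in range(d))
        X = tuple(oracle(k, i, query) for i in range(1, n + 1))
    return sum(X) % 2                  # parity of X^1
```

Only the current layer and one query string are stored at any time, which mirrors the $O(n)$ work-tape bound claimed for $M$.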

6 Lexicographically first maximal matching problems are CC-complete

Let $G = (V, W, E)$ be a bipartite graph, where $V = \{v_i\}_{i=0}^{m-1}$, $W = \{w_i\}_{i=0}^{n-1}$ and $E \subseteq V \times W$. The lexicographically first
maximal matching (lfm-matching) is the matching produced by successively matching each vertex $v_0, \dots, v_{m-1}$
to the least vertex available in $W$. We refer to $V$ as the set of bottom nodes and $W$ as the set of top nodes.

In this section we will show that two decision problems concerning the lfm-matching of a bipartite graph are
CC-complete under AC$^0$ many-one reductions. The lfm-matching problem (Lfmm) is to decide if a designated
edge belongs to the lfm-matching of a bipartite graph $G$. The vertex version of the lfm-matching problem (vLfmm)
is to decide if a designated top node is matched in the lfm-matching of a bipartite graph $G$. Lfmm is the usual
way to define a decision problem for lfm-matching as seen in [12][17]; however, as shown in Sections 6.1 and 6.2,
the vLfmm problem is even more closely related to the Ccv problem.

We will show that the following two more restricted lfm-matching problems are also CC-complete. We define
3Lfmm to be the restriction of Lfmm to bipartite graphs of degree at most three. We define 3vLfmm to be the
restriction of vLfmm to bipartite graphs of degree at most three.

To show that the problems defined above are equivalent under AC$^0$ many-one reductions, it turns out that we
also need the following intermediate problem. A negation gate flips the value on a wire. For comparator circuits
with negation gates, we allow negation gates to appear on any wire (see the left diagram of Fig. 11 below for an
example). The comparator circuit value problem with negation gates (Ccv¬) is: given a comparator circuit with
negation gates and input assignment, and a designated wire, decide if that wire outputs 1.

All reductions in this section are summarized in Fig. 7.

6.1 Ccv $\le_m^{\mathrm{AC}^0}$ 3vLfmm

By Proposition 2 it suffices to consider only instances of Ccv in which all comparator gates point upward. We will
show that these instances of Ccv are AC$^0$ many-one reducible to instances of 3vLfmm, which consist of bipartite
graphs with degree at most three.

The key observation is that a comparator gate on the left below closely relates to an instance of 3vLfmm
on the right. We use the top nodes $p_0$ and $q_0$ to represent the values $p_0$ and $q_0$ carried by the wires $x$ and $y$
respectively before the comparator gate, and the nodes $p_1$ and $q_1$ to represent the values of $x$ and $y$ after the
comparator gate, where a top node is matched iff its respective value is one.

[Figure: a comparator gate on wires $x$ and $y$ with inputs $p_0$, $q_0$ and outputs $p_1 = p_0 \vee q_0$, $q_1 = p_0 \wedge q_0$ (left), and the corresponding 3vLfmm gadget (right).]

If nodes $p_0$ and $q_0$ have not been previously matched, i.e. $p_0 = q_0 = 0$ in the comparator circuit, then the edges
$(x, p_0)$ and $(y, q_0)$ are added to the lfm-matching. So the nodes $p_1$ and $q_1$ are not matched. If $p_0$ has been
previously matched, but $q_0$ has not, then edges $(x, p_1)$ and $(y, q_0)$ are added to the lfm-matching. So the node $p_1$
will be matched but $q_1$ will remain unmatched. The other two cases are similar.

Thus, we can reduce a comparator circuit to the bipartite graph of a 3vLfmm instance by converting each
comparator gate into the "gadget" described above. We will describe our method through an example, where we
are given the comparator circuit in Fig. 8.

We divide the comparator circuit into vertical layers 0, 1, 2 as shown in Fig. 8. Since the circuit has three wires
a, b, c, for each layer i, we use six nodes, including three top nodes $a_i$, $b_i$ and $c_i$ representing the values of the
wires a, b, c respectively, and three bottom nodes $a'_i$, $b'_i$ and $c'_i$, which are auxiliary nodes used to simulate the
effect of the comparator gate at layer i.

Layer 0: This is the input layer, so we add an edge $\{x_0, x'_0\}$ iff the wire x takes input value 1. In this example,
since b and c are wires taking input 1, we need to add the edges $\{b_0, b'_0\}$ and $\{c_0, c'_0\}$.




Figure 8

Layer 1: We then add the gadget simulating the comparator gate from wire b to wire a as follows.

[Figure: the bipartite graph after adding the layer-1 gadget.]

Since the value of wire c does not change when going from layer 0 to layer 1, we can simply propagate the value
of $c_0$ to $c_1$ using the pair of dashed edges in the picture.

Layer 2: We proceed very similarly to layer 1 to get the following bipartite graph. 




Finally, we can get the output values of the comparator circuit by looking at the "output" nodes $a_2$, $b_2$, $c_2$ of this
bipartite graph. We can easily check that $a_2$ is the only node that remains unmatched, which corresponds exactly
to the only zero produced by wire a of the comparator circuit in Fig. 8.
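The layer-by-layer construction can be sketched in code. The following Python sketch is an illustration under stated assumptions, not the authors' construction verbatim: the adjacency of the gadget is reconstructed from the case analysis above (the bottom node of the head wire sees its old and new top nodes; the bottom node of the tail wire additionally sees the new top node of the head wire), wires are numbered from the top, and a gate is a hypothetical pair `(tail, head)` with `head < tail`. All function names are illustrative.

```python
from itertools import product

def eval_circuit(gates, inputs):
    """Evaluate a comparator circuit; gate (tail, head) puts AND on tail, OR on head."""
    v = list(inputs)
    for tail, head in gates:
        v[tail], v[head] = v[tail] & v[head], v[tail] | v[head]
    return v

def build_graph(gates, inputs):
    """Map each bottom node (layer, wire) to its top-node neighbours (layer, wire)."""
    n = len(inputs)
    # Layer 0: the bottom node of wire w is joined to top node (0, w) iff the input is 1.
    adj = {(0, w): ([(0, w)] if inputs[w] else []) for w in range(n)}
    for i, (tail, head) in enumerate(gates, start=1):
        assert head < tail, "this sketch assumes upward gates (head wire listed first)"
        for w in range(n):
            if w == tail:
                adj[(i, w)] = [(i - 1, w), (i, head), (i, w)]
            else:
                # head wire of the gate, or a wire whose value is simply propagated
                adj[(i, w)] = [(i - 1, w), (i, w)]
    return adj

def lfm_matched_tops(adj):
    """Greedy lfm-matching: bottom nodes in order, each takes its first free top node."""
    matched = set()
    for bottom in sorted(adj):
        for top in sorted(adj[bottom]):
            if top not in matched:
                matched.add(top)
                break
    return matched

def outputs_via_matching(gates, inputs):
    """Read the circuit outputs off the matched top nodes of the last layer."""
    matched = lfm_matched_tops(build_graph(gates, inputs))
    last = len(gates)
    return [int((last, w) in matched) for w in range(len(inputs))]

def agrees_on_all_inputs(n, gates):
    """Compare the matching-based outputs with direct evaluation on every input."""
    return all(outputs_via_matching(gates, bits) == eval_circuit(gates, bits)
               for bits in product([0, 1], repeat=n))
```

Every node in the generated graph has degree at most three, and `agrees_on_all_inputs` compares the matched output nodes with a direct evaluation of the circuit on all inputs.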

It remains to argue that the construction above is an $AC^0$ many-one reduction. We observe that each gate in
the comparator circuit can be independently reduced to exactly one gadget in the bipartite graph that simulates
the effect of the comparator gate; furthermore, the position of each gadget can be easily calculated from the
position of each gate in the comparator circuit using very simple arithmetic.

6.2 vLfmm $\le_m^{AC^0}$ Ccv

Consider the instance of vLfmm consisting of the bipartite graph in Fig. 9. Recall that we find the lfm-matching
by matching the bottom nodes x, y, z successively to the first available node on the top. Hence we can simulate
the matching of the bottom nodes to the top nodes using the comparator circuit on the right of Fig. 9, where we
can think of the moving of a 1, say from wire x to wire a, as the matching of node x to node a in the original
bipartite graph. In this construction, a top node is matched iff its corresponding wire in the circuit outputs 1.




Figure 9 

Note that we draw bullets without any arrows going out from them in the circuit to denote dummy gates,
which do nothing. These dummy gates are introduced for the following technical reason. Since the bottom
nodes might not have the same degree, the position of a comparator gate really depends on the structure of the
graph, which makes it harder to give a direct $AC^0$ reduction. By using dummy gates, we can treat the graph as if it
were a complete bipartite graph, with the missing edges represented by dummy gates. This can easily be shown to be an
$AC^0$ reduction from vLfmm to Ccv. Together with the reduction from Section 6.1, we get the following theorem.

► Theorem 27. The problems Ccv, 3vLfmm and vLfmm are equivalent under $AC^0$ many-one reductions.
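The simulation of Section 6.2 can be illustrated with a short Python sketch. It is only a sketch under assumptions: top nodes are wires `0..n_top-1` with input 0, bottom node `x` is wire `n_top + x` with input 1, a gate is a hypothetical pair `(tail, head)` where the head receives the OR, and a dummy gate is represented by `None`. The names are illustrative and not taken from the paper.

```python
def lfm_matched_tops(n_top, adj):
    """Greedy lfm-matching; adj[x] is the set of top nodes adjacent to bottom node x."""
    matched = set()
    for nbrs in adj:
        for a in range(n_top):
            if a in nbrs and a not in matched:
                matched.add(a)
                break
    return matched

def circuit_slots(n_top, adj):
    """One slot per (bottom, top) pair, as if the graph were complete bipartite.
    A present edge becomes a gate from the bottom wire to the top wire;
    a missing edge becomes a dummy gate (None)."""
    slots = []
    for x, nbrs in enumerate(adj):
        for a in range(n_top):
            slots.append((n_top + x, a) if a in nbrs else None)
    return slots

def run_circuit(n_top, n_bottom, slots):
    """Top wires start at 0, bottom wires at 1; return the outputs of the top wires."""
    v = [0] * n_top + [1] * n_bottom
    for gate in slots:
        if gate is None:          # dummy gate: does nothing
            continue
        tail, head = gate
        v[tail], v[head] = v[tail] & v[head], v[tail] | v[head]
    return v[:n_top]
```

Because every bottom node contributes exactly `n_top` slots, the position of each gate depends only on the indices of its endpoints, which is the point of the dummy gates.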

6.3 Ccv $\le_m^{AC^0}$ 3Lfmm

We start by applying the reduction Ccv $\le_m^{AC^0}$ 3vLfmm of Section 6.1 to get an instance of 3vLfmm, and notice
that the top "output" nodes of the resulting bipartite graph (the layer-2 top nodes in the example
of Section 6.1) have degree at most two. Now we show how to reduce such instances of 3vLfmm (i.e. those whose
designated top vertices have degree at most two) to 3Lfmm. Consider the graph G with degree at most three and
a designated top vertex b of degree two as shown on the left of Fig. 10. We extend it to a bipartite graph G' by
adding an additional top node $w_t$ and an additional bottom node $w_b$, alongside two edges $\{b, w_b\}$ and $\{w_t, w_b\}$,
as shown in Fig. 10. Observe that the degree of the new graph G' is at most three.




Figure 10 

We treat the resulting bipartite graph G' and the edge $\{w_t, w_b\}$ as an instance of 3Lfmm. It is not hard to see
that the vertex b is matched in the lfm-matching of the original bipartite graph G iff the edge $\{w_t, w_b\}$ is in the
lfm-matching of the new bipartite graph G'.
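A small Python sketch of this extension step follows. It assumes that the new nodes $w_t$ and $w_b$ come last in their respective orderings, and it does not check the degree bound; the function names are illustrative only.

```python
def lfm_edges(adj):
    """Greedy lfm-matching; adj[x] is the set of top neighbours of bottom node x.
    Returns the matched edges as (bottom, top) pairs."""
    used, edges = set(), set()
    for x, nbrs in enumerate(adj):
        for a in sorted(nbrs):
            if a not in used:
                used.add(a)
                edges.add((x, a))
                break
    return edges

def to_edge_instance(n_top, adj, b):
    """Add top node w_t and bottom node w_b (both last in their orderings), joined by
    the edges {b, w_b} and {w_t, w_b}. Returns the new graph and the designated edge."""
    w_t, w_b = n_top, len(adj)
    return adj + [{b, w_t}], (w_b, w_t)

def vertex_matched(adj, b):
    """Is top node b matched in the lfm-matching of the original graph?"""
    return any(a == b for _, a in lfm_edges(adj))
```

The new bottom node is processed last and tries b first, so it falls through to $w_t$ exactly when b is already matched.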

6.4 Ccv¬ $\le_m^{AC^0}$ Ccv



Recall that the comparator circuit value problem with negation gates (Ccv¬) is the task of deciding, given a
comparator circuit with negation gates and an input assignment, whether a designated wire outputs one. It should
be clear that Ccv is a special case of Ccv¬ and hence $AC^0$ many-one reducible to Ccv¬. Here, we show the
nontrivial direction that Ccv¬ $\le_m^{AC^0}$ Ccv. Our proof is based on Subramanian's idea from [17].

Figure 11 Successive gates on the left circuit correspond to successive boxes of gates on the right circuit.

The reduction is based on "double-rail" logic. Given an instance of Ccv¬ consisting of a comparator circuit
with negation gates C with its input I and a designated wire s, we construct an instance of Ccv consisting of a
comparator circuit C' with its input I' and a designated wire s' as follows. For every wire w in C we put in two
corresponding wires, $w$ and $\bar{w}$, in C'. We define the input I' of C' such that the input value of $\bar{w}$ is the negation
of the input value of w. We want to fix things so that the value carried by the wire $\bar{w}$ at each layer is always the
negation of the value carried by w. For any comparator gate $(y, x)$ in C we put in C' the gate $(y, x)$ followed by the
gate $(\bar{x}, \bar{y})$. It is easy to check using De Morgan's laws that the wires x and y in C' carry the corresponding values
of x and y in C, and the wires $\bar{x}$ and $\bar{y}$ in C' carry the negations of the wires x and y in C.

The circuit C' has one extra wire t with input value 0 to help in translating negation gates. For each negation
gate on a wire, say z in the example from Fig. 11, we add three comparator gates $(z, t)$, $(\bar{z}, z)$, $(t, \bar{z})$ as shown in
the right circuit of Fig. 11. Thus t acts as a temporary "container" that we use to swap the values carried by the wires
$z$ and $\bar{z}$. Note that the swapping of values of $z$ and $\bar{z}$ in C' simulates the effect of a negation in C. Also note that
after the swap takes place, the value of t is restored to 0. (The more straightforward solution of simply switching
the wires $z$ and $\bar{z}$ does not result in an $AC^0$ many-one reduction.)

Finally note that the output value of the designated wire s in C is 1 iff the output value of the corresponding
wire s in C' with input I' is 1. Thus we set the designated wire s' in C' to be s.
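The double-rail translation can be checked mechanically. The Python sketch below is an illustration under assumptions: operations are hypothetical tuples `('cmp', tail, head)` and `('neg', wire)`, the head of a comparator gate receives the OR, wire w of C is mapped to wires `2w` and `2w+1` of C', and the helper wire t is wire `2n`. The encoding and names are choices made for the sketch.

```python
from itertools import product

def run(ops, inputs):
    """Evaluate a circuit. ('cmp', tail, head): AND on tail, OR on head. ('neg', w): flip w."""
    v = list(inputs)
    for op in ops:
        if op[0] == 'neg':
            v[op[1]] ^= 1
        else:
            _, tail, head = op
            v[tail], v[head] = v[tail] & v[head], v[tail] | v[head]
    return v

def double_rail(n, ops, inputs):
    """Wire w of C becomes wires 2w (w) and 2w+1 (its complement) of C'; wire 2n is t."""
    t = 2 * n
    new_ops = []
    for op in ops:
        if op[0] == 'cmp':
            _, y, x = op                                  # gate (y, x) in C
            new_ops += [('cmp', 2 * y, 2 * x),            # (y, x)
                        ('cmp', 2 * x + 1, 2 * y + 1)]    # (x-bar, y-bar)
        else:
            z = op[1]
            new_ops += [('cmp', 2 * z, t),                # (z, t): move z into t
                        ('cmp', 2 * z + 1, 2 * z),        # (z-bar, z): move z-bar into z
                        ('cmp', t, 2 * z + 1)]            # (t, z-bar): move t into z-bar
    new_inputs = [b for w in inputs for b in (w, 1 - w)] + [0]
    return new_ops, new_inputs

def simulation_is_faithful(n, ops):
    """Check on every input that C' tracks each wire of C and its complement,
    and that the helper wire t ends at 0."""
    for bits in product([0, 1], repeat=n):
        v = run(ops, bits)
        u = run(*double_rail(n, ops, bits))
        ok = all(u[2 * w] == v[w] and u[2 * w + 1] == 1 - v[w] for w in range(n))
        if not ok or u[2 * n] != 0:
            return False
    return True
```

The translated circuit contains comparator gates only, and each original operation expands to a fixed-size block, in line with the boxes of Fig. 11.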

6.5 Lfmm $\le_m^{AC^0}$ Ccv¬



Consider an instance of Lfmm consisting of the bipartite graph on the left of Fig. 12 and a designated edge {y, c}.
Without loss of generality, we can safely ignore all top vertices occurring after c, all bottom vertices occurring
after y, and all the edges associated with them, since they are not going to affect the outcome of the instance.
Using the construction from Section 6.2, we can simulate the matching of the bottom nodes to the top nodes
using the comparator circuit in the upper box on the right of Fig. 12.

Figure 12

We keep another running copy of this simulation on the bottom (see the wires labelled a', b', c', x', y' in
Fig. 12). The only difference is that the comparator gate (y', c') corresponding to the designated edge {y, c} is
not added. Finally, we add a negation gate on c' and a comparator gate (c', c). We let the desired output of the
Ccv¬ instance be the output of c, since c outputs 1 iff the edge {y, c} is added to the lfm-matching. It is not hard to
generalize this construction to an arbitrary bipartite graph and designated edge.

Combined with the constructions from Sections 6.1 and 6.2, we have the following corollary.

► Corollary 28. The problems Ccv, 3vLfmm, vLfmm, Ccv¬, 3Lfmm and Lfmm are equivalent under $AC^0$ many-one
reductions.

Since Ccv¬ is complete for CC, we can use comparator circuits to decide the complement of the Ccv problem:
given a comparator circuit and an input assignment, does a designated wire output zero? Thus, we have
the following corollary.

► Corollary 29 (Subramanian [12]). CC is closed under complementation. 

7 The Sm problem is CC-complete 

7.1 3Lfmm is AC many-one reducible to Sm, MoSm and WoSm 

We start by showing that 3Lfmm is AC many-one reducible to Sm when we regard both 3Lfmm and Sm as search 
problems. (Of course the lfm-matching is the unique solution to 3Lfmm formulated as a search problem, but it 
is still a total search problem.) 

Let G = (V, W, E) be a bipartite graph from an instance of 3Lfmm, where V is the set of bottom nodes, W is
the set of top nodes, and E is the edge relation such that the degree of each node is at most three (see the example
in the figure on the left below). Without loss of generality, we can assume that |V| = |W| = n. To reduce it to an
instance of Sm, we double the number of nodes in each partition, where the new nodes are enumerated after
the original nodes and the original nodes are enumerated using the ordering of the original bipartite graph, as
shown in the diagram on the right below. We also let the bottom nodes and top nodes represent the men and
women respectively.




It remains to define a preference list for each person in this Sm instance. The preference list of each man
$m_i$, who represents a bottom node in the original graph, starts with all the women $w_j$ (at most three of them)
adjacent to $m_i$ in the order that these women are enumerated, followed by all the women $w_n, \dots, w_{2n-1}$; the
list ends with all women $w_j$ not adjacent to $m_i$, also in the order that they are enumerated. For example, the
preference list of $m_2$ in our example is $w_2, w_3, w_4, w_5, w_0, w_1$. The preference list of each newly introduced man
$m_{n+i}$ simply consists of $w_0, \dots, w_{n-1}, w_n, \dots, w_{2n-1}$, i.e., in the order that the top nodes are listed. Preference lists
for the women are defined dually.

Intuitively, the preference lists are constructed so that any stable marriage (not necessarily man-optimal) of
the new Sm instance must contain the lfm-matching of G. Furthermore, if a bottom node u from the original
graph is not matched to any top node in the lfm-matching of G, then the man $m_i$ representing u will marry some
top node $w_{n+j}$, which is a dummy node that does not correspond to any node of G.
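To make the preference lists concrete, the following Python sketch builds them from a graph and runs a textbook deferred-acceptance procedure on the result. It is a hedged illustration: nodes are numbered from 0, `adj[i]` is the list of top nodes adjacent to bottom node i, and the function names are not from the paper. The deferred-acceptance routine is the standard one and is included only to exercise the construction.

```python
def build_sm_instance(n, adj):
    """Return (men_pref, women_pref) over 2n men and 2n women, following the text:
    adjacent partners in order, then all dummies n..2n-1, then the non-adjacent ones."""
    adj_w = [[i for i in range(n) if j in adj[i]] for j in range(n)]

    def prefs(neighbours):
        real = [sorted(nb) + list(range(n, 2 * n)) + [k for k in range(n) if k not in nb]
                for nb in neighbours]
        dummy = [list(range(2 * n)) for _ in range(n)]   # dummies list everyone in order
        return real + dummy

    return prefs(adj), prefs(adj_w)

def deferred_acceptance(men_pref, women_pref):
    """Standard men-proposing deferred acceptance; returns wife[m] for every man m."""
    size = len(men_pref)
    rank = [{m: r for r, m in enumerate(p)} for p in women_pref]
    next_choice, husband, free = [0] * size, [None] * size, list(range(size))
    while free:
        m = free.pop()
        w = men_pref[m][next_choice[m]]
        next_choice[m] += 1
        h = husband[w]
        if h is None:
            husband[w] = m
        elif rank[w][m] < rank[w][h]:
            husband[w] = m
            free.append(h)
        else:
            free.append(m)
    wife = [None] * size
    for w, m in enumerate(husband):
        wife[m] = w
    return wife
```

For the graph with `adj = [[0, 1], [0], [1, 2]]`, the lfm-matching pairs bottom node 0 with top node 0 and bottom node 2 with top node 1, and leaves bottom node 1 unmatched; in the resulting marriage the corresponding man is paired with a dummy woman.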

The above construction gives us an $AC^0$ many-one reduction from 3Lfmm to Sm as search problems, since any
solution of a stable marriage instance constructed by the above reduction provides all the information needed to
decide whether an edge is in the lfm-matching of the original 3Lfmm instance. The key point is that every
instance of stable marriage produced by the above reduction has a unique solution; thus the man-optimal solution
coincides with the woman-optimal solution. Moreover, the above construction also shows that the decision
version of 3Lfmm is $AC^0$ many-one reducible to either of the decision problems MoSm and WoSm. Hence we
have proven the following theorem, whose detailed proof can be found in [5].

► Theorem 30. 3Lfmm is $AC^0$ many-one reducible to Sm, MoSm and WoSm.

7.2 Three-valued Ccv is CC-complete



In the remainder of the section, we will be occupied with developing an algorithm due to Subramanian [16, 17]
that finds a stable marriage using comparator circuits, thus furnishing an $AC^0$ reduction from Sm to Ccv. To this
end, it turns out to be conceptually simpler to go through a new variant of Ccv, where the wires are three-valued
instead of Boolean.

We define the Three-valued Ccv problem similarly to Ccv, i.e., we want to decide, on a given input assignment,
if a designated wire of a comparator circuit outputs one. The only difference is that each wire can now take
either value 0, 1 or *, where a wire takes value * when its value is not known to be 0 or 1. The output values of
the comparator gate on two input values p and q will be defined as follows.



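\[
p \wedge q = \begin{cases} 0 & \text{if } p = 0 \text{ or } q = 0 \\ 1 & \text{if } p = q = 1 \\ * & \text{otherwise,} \end{cases}
\qquad
p \vee q = \begin{cases} 0 & \text{if } p = q = 0 \\ 1 & \text{if } p = 1 \text{ or } q = 1 \\ * & \text{otherwise.} \end{cases}
\]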



Clearly every instance of Ccv is also an instance of Three-valued Ccv. We will show that every instance of
Three-valued Ccv is $AC^0$ many-one reducible to an instance of Ccv by using a pair of Boolean wires to represent
each three-valued wire and adding comparator gates appropriately to simulate three-valued comparator
gates.

► Theorem 31. Three-valued Ccv and Ccv are equivalent under $AC^0$ many-one reductions.

Proof. Since each instance of Ccv is a special case of Three-valued Ccv, it only remains to show that every
instance of Three-valued Ccv is $AC^0$ many-one reducible to an instance of Ccv.

First, we will describe a gadget built from standard comparator gates that simulates a three-valued comparator
gate as follows. Each wire of an instance of Three-valued Ccv will be represented by a pair of wires in an
instance of Ccv. Each three-valued comparator gate on the left below, where $p, q, p \wedge q, p \vee q \in \{0, 1, *\}$, can be
simulated by a gadget consisting of two standard comparator gates on the right below.

The wires x and y are represented using the two pairs of wires $(x_1, x_2)$ and $(y_1, y_2)$, and the three possible values
0, 1 and * will be encoded by (0, 0), (1, 1), and (0, 1) respectively. The fact that our gadget correctly simulates the
three-valued comparator gate is shown in the following table.

Using this gadget, we can reduce an instance of Three-valued Ccv to an instance of Ccv by doubling the
number of wires, and replacing every three-valued comparator gate of the Three-valued Ccv instance with a
gadget with two standard comparator gates simulating it.

The above construction shows how to reduce the question of whether a designated wire outputs 1 for a given
instance of Three-valued Ccv to the question of whether a pair of wires of an instance of Ccv output (1, 1).
However, for an instance of Ccv we are only allowed to decide whether a single designated wire outputs 1. This
technical difficulty can be easily overcome since we can use an $\wedge$-gate (one of the two outputs of a comparator
gate) to test whether a pair of wires outputs (1, 1), and output the result on a single designated wire. ◄
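The pair encoding can be verified exhaustively over the nine input combinations. The Python sketch below assumes that the two standard comparator gates of the gadget act rail-wise, one on $(x_1, y_1)$ and one on $(x_2, y_2)$; this pairing is an assumption of the sketch, since the gadget is only given as a figure. The names `ENC`, `and3`, `or3` and `gadget` are illustrative.

```python
STAR = '*'
ENC = {0: (0, 0), 1: (1, 1), STAR: (0, 1)}   # encoding of 0, 1, * from the text
DEC = {pair: value for value, pair in ENC.items()}

def and3(p, q):
    """Three-valued AND as defined in the text."""
    if p == 0 or q == 0:
        return 0
    if p == 1 and q == 1:
        return 1
    return STAR

def or3(p, q):
    """Three-valued OR as defined in the text."""
    if p == 1 or q == 1:
        return 1
    if p == 0 and q == 0:
        return 0
    return STAR

def gadget(p, q):
    """Simulate one three-valued comparator gate with two Boolean comparator gates,
    one on the first rails (x1, y1) and one on the second rails (x2, y2)."""
    (x1, x2), (y1, y2) = ENC[p], ENC[q]
    and_pair = (x1 & y1, x2 & y2)
    or_pair = (x1 | y1, x2 | y2)
    return DEC[and_pair], DEC[or_pair]

def gadget_is_correct():
    values = [0, 1, STAR]
    return all(gadget(p, q) == (and3(p, q), or3(p, q)) for p in values for q in values)
```

Since every code word satisfies first rail ≤ second rail and both AND and OR preserve that property, the pair (1, 0) never arises and decoding is always defined.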

7.3 Algorithms for solving stable marriage problems 

In this section, we develop a reduction from Sm to Ccv due to Subramanian [16, 17], and later extended to a more
general class of problems by Feder [6, 7]. Subramanian did not reduce Sm to Ccv directly, but to the network
stability problem built from the less standard X gate, which takes two inputs p and q and produces two outputs
$p' = p \wedge \neg q$ and $q' = \neg p \wedge q$. It is important to note that the "network" notion in Subramanian's work denotes a
generalization of circuits by allowing a connection from the output of a gate to the input of any gate including
itself, and thus a network in his definition might contain cycles. An X-network is a network consisting only of X
gates under the important restriction that each X gate has fan-out exactly one for each output it computes. The
network stability problem for X gates (Xns) is then to decide if an X-network has a stable configuration, i.e., a way
to assign Boolean values to the wires of the network so that the values are compatible with all the X gates of the
network. Subramanian showed in his dissertation [16] that Sm, Xns and Ccv are all equivalent under log-space
reductions.

We do not work with Xns in this paper since networks are less intuitive and do not have a nice graphical
representation as do comparator circuits. By utilizing Subramanian's idea, we give a direct $AC^0$ reduction from
Sm to Ccv, using the three-valued variant of Ccv developed in Section 7.2.

We will describe a sequence of algorithms, starting with Gale and Shapley's algorithm, which is historically 
the first algorithm solving the stable marriage problem, and ending with Subramanian's algorithm. 

7.3.1 Notation 

Let M denote the set of men, and W denote the set of women; both are of size n. The preference list for a person
p is given by

$\pi_1(p) >_p \pi_2(p) >_p \cdots >_p \pi_n(p) >_p \bot.$

The last place on the list is taken by the placeholder $\bot$, which represents p being unmatched, a situation less
preferable than being matched. If p is a man then $\pi_1(p), \dots, \pi_n(p)$ are women, and vice versa.

The preference relation $>_p$ is defined by $\pi_i(p) >_p \pi_j(p)$ whenever i < j; we say that p prefers $\pi_i(p)$ over
$\pi_j(p)$. For a set of women $W_0$ and a man m, the woman m prefers the most is $\max_m W_0$; if $W_0$ is empty, then
$\max_m W_0 = \bot$. Let S be a set; then we write $q >_p S$ to denote that p prefers q to any person in S; similarly, $q <_p S$
denotes that p prefers any person in S to q.

A marriage P is a set of pairs (m, w) which forms a perfect matching between the set of men and the set of
women. In a marriage P, we let P(p) denote the person p is married to. A marriage is stable if there is no unstable
pair (m, w), which is a pair satisfying $w >_m P(m)$ and $m >_w P(w)$, i.e. m and w prefer each other to their
current partners.
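The definition of stability translates directly into a check over all pairs. The following Python sketch assumes men and women are both numbered `0..n-1`, preference lists are given most preferred first, and a marriage is a list `wife` with `wife[m]` the partner of man m; these representation choices are made for the sketch.

```python
def is_stable(men_pref, women_pref, wife):
    """Return True iff the marriage has no unstable pair (m, w), i.e. no pair in which
    m prefers w to wife[m] and w prefers m to her husband."""
    n = len(men_pref)
    husband = {w: m for m, w in enumerate(wife)}
    rank_m = [{w: r for r, w in enumerate(p)} for p in men_pref]
    rank_w = [{m: r for r, m in enumerate(p)} for p in women_pref]
    for m in range(n):
        for w in range(n):
            m_prefers_w = rank_m[m][w] < rank_m[m][wife[m]]
            w_prefers_m = rank_w[w][m] < rank_w[w][husband[w]]
            if m_prefers_w and w_prefers_m:
                return False
    return True
```

For example, if both men rank woman 0 first and both women rank man 0 first, then pairing man 0 with woman 1 leaves (man 0, woman 0) as an unstable pair.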

7.3.2 Gale-Shapley 

Gale and Shapley's algorithm [8] proceeds in rounds. In the first round, each man proposes to his top woman
among the ones he hasn't proposed to, and each woman selects her most preferred suitor. In each subsequent
round, each rejected man proposes to his next choice, and each woman selects her most preferred suitor (including
her choice from the previous round). The situation eventually stabilizes, resulting in the man-optimal stable
marriage.

There are many ways to implement the algorithm. One of them is illustrated below in Algorithm 1. The crucial
object is the graph G, which is a set of possible matches. Each round, each man m selects the top woman top(m)
currently available to him. Among all men who chose her (if any), each woman w selects the best suitor best(w).
Whenever any man m is rejected by his top woman w, we remove the possible match (m, w) from G.

Algorithm 1 Gale-Shapley
    $G \leftarrow \{(m, w) : m \in M, w \in W\}$
    repeat
        $\mathrm{top}(m) \leftarrow \max_m \{w : (m, w) \in G\}$ for all $m \in M$
        $\mathrm{best}(w) \leftarrow \max_w \{m : \mathrm{top}(m) = w\}$ for all $w \in W$
        Remove $(m, \mathrm{top}(m))$ from G whenever $\mathrm{best}(\mathrm{top}(m)) \ne m$
    until G stops changing
    return $\{(m, \mathrm{top}(m)) : m \in M\}$
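Algorithm 1 can be transcribed almost line by line. The Python sketch below follows the round structure of the pseudocode rather than the usual proposal-queue formulation; people are numbered `0..n-1` on each side, `None` plays the role of $\bot$ for a woman with no suitor, and the function name is illustrative.

```python
def gale_shapley_rounds(men_pref, women_pref):
    """Round-based Gale-Shapley on the graph G of possible matches.
    Returns the man-optimal marriage as a dict m -> top(m)."""
    n = len(men_pref)
    G = {(m, w) for m in range(n) for w in range(n)}
    while True:
        # top(m): the woman m likes best among his remaining possible matches
        top = {m: next(w for w in men_pref[m] if (m, w) in G) for m in range(n)}
        # best(w): the suitor w likes best among the men whose top is w (None if none)
        best = {w: next((m for m in women_pref[w] if top[m] == w), None)
                for w in range(n)}
        rejected = {(m, top[m]) for m in range(n) if best[top[m]] != m}
        if not rejected:          # G stops changing
            return top
        G -= rejected
```

Each round removes at least one pair from G or terminates, which is the source of the $n^2$ bound on the number of rounds.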



► Lemma 32. Algorithm 1 returns the man-optimal stable matching, and terminates after at most $n^2$ rounds.

Proof. Admissibility: If a pair (m, w) is removed from G, then no stable marriage contains the pair (m, w). This
is proved by induction on the number of pairs removed. A pair (m, w) is removed when w = top(m) = top(m')
for some other man m', and $m' >_w m$. Suppose for a contradiction that P is a stable marriage and P(m) = w.
By the induction hypothesis, we know that m' can never be married to any woman w' such that $w' >_{m'} w$, since
the edge (m', w') was removed previously. Thus $w >_{m'} P(m')$. But then (m', w) would be an unstable pair, a
contradiction.

Definiteness: For all men m and at all times, $\mathrm{top}(m) \ne \bot$. For any man m, $(m, \pi_n(m))$ is never removed from
G, and so top(m) is always well-defined. Indeed, for each w, after each iteration best(w) is non-decreasing in
the preference order of w. So if $(m, w) \notin G$, then $\mathrm{best}(w) >_w m$. On the other hand, for any two women w and w', if
$\mathrm{best}(w), \mathrm{best}(w') \ne \bot$ then $\mathrm{best}(w) \ne \mathrm{best}(w')$. Thus, if $(m, w) \notin G$ for all $w \in W$, then best is an injective mapping
from W into $M \setminus \{m\}$, contradicting the pigeonhole principle.

Completeness: The output of the algorithm is a marriage. The algorithm ends when best(top(m)) = m for
every m, which implies that best and top are mutually inverse bijections.

Stability: The output of the algorithm is a stable marriage. Suppose (m, w) were an unstable pair, so at the
end of the algorithm, $m >_w \mathrm{best}(w)$ and $w >_m \mathrm{top}(m)$ (we're using the fact that top and best are inverses at the
end of the algorithm). However, $m >_w \mathrm{best}(w)$ implies $\mathrm{top}(m) \ne w$, which implies $\mathrm{top}(m) >_m w$, a contradiction.

Optimality: The output of the algorithm is the man-optimal stable marriage. This is obvious, since each man
gets his best choice among all possible stable marriages.

Runtime: The algorithm terminates in $n^2$ iterations since at most $n^2$ edges can be deleted from G.

Gale-Shapley has one disadvantage: it only computes the man-optimal stable matching. This is easy to rectify
by symmetrizing the algorithm, resulting in Algorithm 2. While in the original algorithm, only the men propose
(select their top choices), and only the women accept or reject (choose the most promising suitor), in the symmetric
algorithm, both sexes participate in both tasks in parallel. The algorithm returns both the man-optimal
and the woman-optimal stable marriages.

Algorithm 2 Symmetric Gale-Shapley
    $G \leftarrow \{(m, w) : m \in M, w \in W\}$
    repeat
        $\mathrm{top}(m) \leftarrow \max_m \{w : (m, w) \in G\}$ for all $m \in M$
        $\mathrm{top}(w) \leftarrow \max_w \{m : (m, w) \in G\}$ for all $w \in W$
        $\mathrm{best}(w) \leftarrow \max_w \{m : \mathrm{top}(m) = w\}$ for all $w \in W$
        $\mathrm{best}(m) \leftarrow \max_m \{w : \mathrm{top}(w) = m\}$ for all $m \in M$
        Remove $(m, \mathrm{top}(m))$ from G whenever $\mathrm{best}(\mathrm{top}(m)) \ne m$
        Remove $(\mathrm{top}(w), w)$ from G whenever $\mathrm{best}(\mathrm{top}(w)) \ne w$
    until G stops changing
    return $\{(m, \mathrm{top}(m)) : m \in M\}$ and $\{(\mathrm{top}(w), w) : w \in W\}$ as the man-optimal and the woman-optimal stable
    marriages respectively

► Lemma 33. Algorithm 2 returns the man-optimal and woman-optimal stable matchings, and terminates after
at most $n^2$ rounds.

Proof. The analysis is largely analogous to the analysis of the original algorithm. Every pair (m, w) removed
from G belongs to no stable marriage. Furthermore, since a stable marriage exists, top(m) and top(w) are always
defined after the algorithm finishes. At the end of the algorithm, top and best are mutually inverse bijections on
$M \cup W$, hence the outputs are marriages. The same arguments as before show that the marriages returned by the
algorithm are man-optimal and woman-optimal stable marriages respectively. Finally, the algorithm terminates
in $n^2$ iterations since we can only remove at most $n^2$ edges from G.

7.3.3 Interval algorithms 

At the end of Algorithm 2, for each man m, his partner in the man-optimal stable marriage is top(m), while his
partner in the woman-optimal stable marriage is best(m). The same holds for women (with the roles of the
sexes reversed). This prompts our next algorithm, Algorithm 3, which explicitly keeps track of an interval J(p) of
possible matches for each person p (these are intervals in the person's preference order).

At each round, each person p first picks their top choice top(p). Then each person q picks their top suitor
best(q), if any. People over whom best(q) is preferred are removed from J(q). If p was rejected by his top choice
top(p), then top(p) is removed from J(p). These update rules maintain the contiguous nature of the intervals.
The situation eventually stabilizes, and the algorithm returns the man-optimal and the woman-optimal stable
marriages.

Algorithm 3 Interval algorithm
    $J_0(m) \leftarrow W$ for all $m \in M$
    $J_0(w) \leftarrow M$ for all $w \in W$
    repeat
        $\mathrm{top}_t(p) \leftarrow \max_p J_t(p)$ for all $p \in M \cup W$
        $\mathrm{best}_t(q) \leftarrow \max_q \{p : q = \mathrm{top}_t(p)\}$ for all $q \in M \cup W$
        Remove p from $J_t(q)$ whenever $p <_q \mathrm{best}_t(q)$, for all p, q of opposite sex
        Remove $\mathrm{top}_t(p)$ from $J_t(p)$ if $p \ne \mathrm{best}_t(\mathrm{top}_t(p))$
        $t \leftarrow t + 1$
    until $J_{t+1}(p) = J_t(p)$ for all $p \in M \cup W$
    return $\{(m, \max_m J_t(m)) : m \in M\}$ and $\{(\max_w J_t(w), w) : w \in W\}$ as the man-optimal and the woman-optimal
    stable marriages respectively
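A possible rendering of Algorithm 3 in Python is given below. It is a sketch under assumptions: all people share one id space (for instance men `0..n-1` and women `n..2n-1`), `pref[p]` lists the people of the opposite sex in p's order, intervals are stored as sets, and each round builds the new intervals from the old ones. `None` stands for $\bot$.

```python
def interval_algorithm(pref):
    """pref[p] lists the people of the opposite sex in p's order, most preferred first.
    Returns the final intervals J and the final top choices."""
    people = list(pref)
    rank = {p: {q: r for r, q in enumerate(pref[p])} for p in people}
    J = {p: set(pref[p]) for p in people}
    while True:
        # top(p): the most preferred person still in J(p)
        top = {p: min(J[p], key=rank[p].get) for p in people}
        # best(q): the most preferred suitor of q, i.e. a person p with top(p) = q
        best = {q: next((p for p in pref[q] if top[p] == q), None) for q in people}
        new_J = {}
        for q in people:
            keep = set(J[q])
            if best[q] is not None:
                # drop everyone q likes less than best(q)
                keep = {p for p in keep if rank[q][p] <= rank[q][best[q]]}
            if best[top[q]] != q:
                keep.discard(top[q])      # q was rejected by top(q)
            new_J[q] = keep
        if new_J == J:
            return J, top
        J = new_J
```

With men 0, 1 and women 2, 3, the man-optimal marriage is read off as `top[m]` for the men and the woman-optimal marriage as `top[w]` for the women.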



► Lemma 34. Algorithm 3 returns the man-optimal and woman-optimal stable matchings, and terminates after
at most $2n^2$ rounds. Furthermore, the man-optimal and woman-optimal matchings are given by

$\{(m, \max_m J_t(m)) : m \in M\}$ and $\{(\max_w J_t(w), w) : w \in W\}$ respectively.

Proof. Admissibility: In every stable marriage, every person p is matched to someone from J(p). This is proved
by induction on the number of rounds. A person q can be removed from J(p) for one of two reasons: either
$q <_p \mathrm{best}(p)$, or q = top(p) and $p \ne \mathrm{best}(q)$. In the former case, if p were matched to q, then (p, best(p)) would
be an unstable pair, since p = top(best(p)) implies that best(p) prefers p to any other partner in J(best(p)). In the
latter case, if p were matched to q = top(p), then (q, best(q)) would be an unstable pair, since q prefers best(q)
over p by definition, and best(q) prefers q as in the former case.

The remaining analysis of this algorithm is similar to the analysis of Gale-Shapley. The outputs of the algorithm
are marriages, since the algorithm ends when best(top(p)) = p for all p, hence top and best are inverse
bijections. The marriages are stable for the same reason given for Gale-Shapley. They are man-optimal and
woman-optimal for the same reason. The number of iterations is at most $2n^2$ since there are 2n intervals, each
of initial length n.

Finally, at the termination of the algorithm, $\mathrm{best}(q) = \max_q J(q)$. Since top and best are inverses, this explains
the dual formulas for the man-optimal and woman-optimal matchings. ◄

Our next algorithm introduces a new twist. Instead of removing top(p) from J(p) whenever $p \ne \mathrm{best}(\mathrm{top}(p))$,
we remove top(p) from J(p) whenever $p \notin J(\mathrm{top}(p))$, as shown in Algorithm 4. The idea is that if at some point
$p \ne \mathrm{best}(\mathrm{top}(p))$, then $\mathrm{best}(\mathrm{top}(p)) >_{\mathrm{top}(p)} p$, so p is removed from J(top(p)). At the following iteration, top(p)
will be removed from J(p) in reciprocity. Thus, Algorithm 4 emulates Algorithm 3 with a delay of one round. We
will later show that the advantage of this strange rule is the nice representation of the same algorithm in three-valued
logic, which can then be transformed to Subramanian's algorithm, implementable by comparator circuits.

Algorithm 4 Delayed interval algorithm
    $J_0(m) \leftarrow W$ for all $m \in M$
    $J_0(w) \leftarrow M$ for all $w \in W$
    $t \leftarrow 0$
    repeat
        $\mathrm{top}_t(p) \leftarrow \max_p J_t(p)$ for all $p \in M \cup W$
        $\mathrm{best}_t(q) \leftarrow \max_q \{p : q = \mathrm{top}_t(p)\}$ for all $q \in M \cup W$
        Remove p from $J_t(q)$ whenever $p <_q \mathrm{best}_t(q)$, for all p, q of opposite sex
        Remove $\mathrm{top}_t(p)$ from $J_t(p)$ if $p \notin J_t(\mathrm{top}_t(p))$
    until $J_{t+1}(p) = J_t(p)$ for all $p \in M \cup W$
    return $\{(m, \max_m J_t(m)) : m \in M\}$ and $\{(\max_w J_t(w), w) : w \in W\}$ as the man-optimal and the woman-optimal
    stable marriages respectively
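The delayed rule changes only the rejection test. The Python sketch below makes one interpretive assumption that should be stated plainly: the membership test $p \notin J_t(\mathrm{top}_t(p))$ is evaluated against the intervals as they stood at the beginning of the round, which is what produces the one-round delay. People share one id space and `pref[p]` is p's preference list, as a representation chosen for the sketch.

```python
def delayed_interval_algorithm(pref):
    """pref[p] lists the people of the opposite sex in p's order, most preferred first.
    Returns the final intervals J and the final top choices."""
    people = list(pref)
    rank = {p: {q: r for r, q in enumerate(pref[p])} for p in people}
    J = {p: set(pref[p]) for p in people}
    while True:
        top = {p: min(J[p], key=rank[p].get) for p in people}
        best = {q: next((p for p in pref[q] if top[p] == q), None) for q in people}
        new_J = {}
        for q in people:
            keep = set(J[q])
            if best[q] is not None:
                # drop everyone q likes less than best(q)
                keep = {p for p in keep if rank[q][p] <= rank[q][best[q]]}
            if q not in J[top[q]]:
                # delayed rule: q has already been dropped from the interval of top(q)
                keep.discard(top[q])
            new_J[q] = keep
        if new_J == J:
            return J, top
        J = new_J
```

On the instance where both men rank woman 2 first and both women rank man 0 first, this variant reaches the same final intervals as the non-delayed rule, one round later.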



► Lemma 35. Algorithm 4 returns the man-optimal and woman-optimal stable matchings, and terminates after
at most $2n^2$ rounds. Furthermore, the man-optimal and woman-optimal matchings are given by

$\{(m, \max_m J_t(m)) : m \in M\}$ and $\{(\max_w J_t(w), w) : w \in W\}$ respectively.

Proof. Clearly Algorithm 4 is admissible, that is, p is matched to someone from J(p) in any stable matching.
Furthermore, at the end of the algorithm, p = best(top(p)). Otherwise, there are two cases. If $p \in J(\mathrm{top}(p))$, then
p would be removed from J(top(p)), and the algorithm would continue. If $p \notin J(\mathrm{top}(p))$, then top(p) would be
removed from J(p), and the algorithm would continue; note that by definition, at the beginning of the round,
$\mathrm{top}(p) \in J(p)$.

The rest of the proof follows the one for Algorithm 3.

► Corollary 36. The intervals at the end of Algorithm 4 coincide with the intervals at the end of Algorithm 3.

Proof. This follows immediately from the two formulas for the output.

The delayed interval algorithm can be implemented using three-valued logic. The key is the following encod- 
ing of the intervals using matrices, which we call the matrix representation. 





\[
\mathcal{M}(m, w) = \begin{cases} 1 & \text{if } w \ge_m \max_m J(m) \\ * & \text{if } \max_m J(m) >_m w \ge_m \min_m J(m) \\ 0 & \text{if } \min_m J(m) >_m w \end{cases}
\qquad
\mathcal{W}(w, m) = \begin{cases} 0 & \text{if } m \ge_w \max_w J(w) \\ * & \text{if } \max_w J(w) >_w m \ge_w \min_w J(w) \\ 1 & \text{if } \min_w J(w) >_w m \end{cases}
\]

In other words, for every man m, the array $\mathcal{M}(m, \pi_1(m)), \dots, \mathcal{M}(m, \pi_n(m))$ has the form

\[ 1 \;\cdots\; 1 \;\boxed{1 \;\; * \;\cdots\; *}\; 0 \;\cdots\; 0 \]

where the women whose corresponding values are contained in the box are precisely the women in J(m).
For every woman w, the array $\mathcal{W}(w, \pi_1(w)), \dots, \mathcal{W}(w, \pi_n(w))$ has the form

\[ 0 \;\cdots\; 0 \;\boxed{0 \;\; * \;\cdots\; *}\; 1 \;\cdots\; 1 \]

where the men whose corresponding values are contained in the box are precisely the men in J(w).

Algorithm 5 is an implementation of Algorithm 4 using three-valued logic. We will show, in a sequence of
steps, that at each point in time, the matrices representing the intervals in Algorithm 4 equal the matrices in
Algorithm 5.

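Algorithm 5 Delayed interval algorithm, three-valued logic formulation

\[
\mathcal{M}_0(m, w) = \begin{cases} 1 & \text{if } w = \pi_1(m) \\ * & \text{otherwise} \end{cases}
\qquad
\mathcal{W}_0(w, m) = \begin{cases} 0 & \text{if } m = \pi_1(w) \\ * & \text{otherwise} \end{cases}
\]

repeat

\[
\mathcal{M}_{t+1}(m, \pi_i(m)) = \begin{cases} 1 & \text{if } i = 1 \\ \mathcal{M}_t(m, \pi_{i-1}(m)) \wedge \bigwedge_{j \le i-1} \mathcal{W}_t(\pi_j(m), m) & \text{otherwise} \end{cases}
\]

\[
\mathcal{W}_{t+1}(w, \pi_i(w)) = \begin{cases} 0 & \text{if } i = 1 \\ \mathcal{W}_t(w, \pi_{i-1}(w)) \vee \bigvee_{j \le i-1} \mathcal{M}_t(\pi_j(w), w) & \text{otherwise} \end{cases}
\]

    $t \leftarrow t + 1$
until $\mathcal{M}_t = \mathcal{M}_{t-1}$ and $\mathcal{W}_t = \mathcal{W}_{t-1}$

$S_M \leftarrow \{(m, w) : \mathcal{M}_t(m, w) = 1 \text{ and } \mathcal{W}_t(w, m) \in \{0, *\}\}$ % man-optimal stable marriage

$S_W \leftarrow \{(m, w) : \mathcal{W}_t(w, m) = 0 \text{ and } \mathcal{M}_t(m, w) \in \{1, *\}\}$ % woman-optimal stable marriage

return $S_M, S_W$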
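The three-valued update rules can be run directly. In the Python sketch below the value * is encoded as the number 0.5, an encoding chosen for the sketch so that the three-valued AND and OR become `min` and `max` under the order 0 < * < 1. Rows are indexed by position in the preference list rather than by partner, and the base cases (1 for $\mathcal{M}$, 0 for $\mathcal{W}$ at i = 1) follow the matrix representation above.

```python
STAR = 0.5   # encoding assumption: 0 < * < 1, so AND is min and OR is max

def three_valued_sm(men_pref, women_pref):
    """Run the three-valued update rules to a fixed point and return (S_M, S_W)."""
    n = len(men_pref)
    pos_m = [{w: i for i, w in enumerate(p)} for p in men_pref]
    pos_w = [{m: i for i, m in enumerate(p)} for p in women_pref]
    # Mx[m][i] holds M(m, pi_{i+1}(m)); Wx[w][i] holds W(w, pi_{i+1}(w)).
    Mx = [[1] + [STAR] * (n - 1) for _ in range(n)]
    Wx = [[0] + [STAR] * (n - 1) for _ in range(n)]
    while True:
        new_M, new_W = [], []
        for m in range(n):
            row, acc = [1], 1
            for i in range(1, n):
                w = men_pref[m][i - 1]
                acc = min(acc, Wx[w][pos_w[w][m]])    # AND of W(pi_j(m), m) over j <= i-1
                row.append(min(Mx[m][i - 1], acc))
            new_M.append(row)
        for w in range(n):
            row, acc = [0], 0
            for i in range(1, n):
                m = women_pref[w][i - 1]
                acc = max(acc, Mx[m][pos_m[m][w]])    # OR of M(pi_j(w), w) over j <= i-1
                row.append(max(Wx[w][i - 1], acc))
            new_W.append(row)
        if new_M == Mx and new_W == Wx:
            break
        Mx, Wx = new_M, new_W
    pairs = [(m, w) for m in range(n) for w in range(n)]
    S_M = {(m, w) for m, w in pairs
           if Mx[m][pos_m[m][w]] == 1 and Wx[w][pos_w[w][m]] != 1}
    S_W = {(m, w) for m, w in pairs
           if Wx[w][pos_w[w][m]] == 0 and Mx[m][pos_m[m][w]] != 0}
    return S_M, S_W
```

On an instance with a unique stable marriage both returned sets coincide; on the two-person instance where each man's first choice ranks him last, the sets are the two distinct stable marriages.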

First, we show that the matrices properly encode intervals.

► Lemma 37. At each time t in the execution of Algorithm 5, and for each man m, the sequence

$\mathcal{M}_t(m, \pi_1(m)), \dots, \mathcal{M}_t(m, \pi_n(m))$

is non-increasing (with respect to the order 1 > * > 0). Similarly, for each woman w, the sequence

$\mathcal{W}_t(w, \pi_1(w)), \dots, \mathcal{W}_t(w, \pi_n(w))$

is non-decreasing.

Proof. The proof is by induction. The claim is clearly true at time t = 0. For the inductive case, it suffices to
analyze $\mathcal{M}$ since $\mathcal{W}$ can be handled dually. Furthermore, at each iteration, we "shift" each sequence $\mathcal{M}(m, \cdot)$ one
step to the right and add a 1 to the left end to get the following non-increasing sequence

$1, \mathcal{M}_{t-1}(m, \pi_1(m)), \mathcal{M}_{t-1}(m, \pi_2(m)), \dots, \mathcal{M}_{t-1}(m, \pi_{n-1}(m)).$

Then we take a component-wise AND of the above sequence with the non-increasing sequence

$1, \; \mathcal{W}_{t-1}(\pi_1(m), m), \; \mathcal{W}_{t-1}(\pi_1(m), m) \wedge \mathcal{W}_{t-1}(\pi_2(m), m), \; \dots, \; \mathcal{W}_{t-1}(\pi_1(m), m) \wedge \cdots \wedge \mathcal{W}_{t-1}(\pi_{n-1}(m), m).$

It's not hard to check that the result is also a non-increasing sequence by the properties of three-valued logic. ◄

Second, we show that the intervals encoded by the matrices can only shrink. This is the same as saying that
whenever an entry gets determined (to a value different from *), it remains constant.

► Lemma 38. If for some time t, for some man m and some woman w, $\mathcal{M}_t(m, w) \in \{0, 1\}$, then $\mathcal{M}_s(m, w) = \mathcal{M}_t(m, w)$
for $s \ge t$. A similar claim holds for $\mathcal{W}$.

Proof. We prove the claim by induction on t. Let $w = \pi_i(m)$. If i = 1, then the claim is trivial. Now suppose i > 1.
If $\mathcal{M}_t(m, \pi_i(m)) = 1$, then

$\mathcal{M}_{t-1}(m, \pi_{i-1}(m)) = \mathcal{W}_{t-1}(\pi_1(m), m) = \cdots = \mathcal{W}_{t-1}(\pi_{i-1}(m), m) = 1.$

The induction hypothesis shows that all these elements retain their values in the next iteration and hence
$\mathcal{M}_{t+1}(m, \pi_i(m)) = 1$. If $\mathcal{M}_t(m, \pi_i(m)) = 0$, then at least one of these elements is equal to zero; this element retains its value in the
next iteration by the induction hypothesis; hence $\mathcal{M}_{t+1}(m, \pi_i(m)) = 0$.

It remains to show that the way that the underlying intervals are updated matches the update rules of the interval algorithm.

► Lemma 39. At each time $t$, the matrix representation of the intervals in the interval algorithm is the same as the matrices $\mathcal{M}_t$, $\mathcal{W}_t$ in the execution of Algorithm 5. Furthermore, both algorithms return the same marriages.

Proof. The proof is by induction on $t$. The base case $t = 0$ is clear by inspection.

We now compare the update rules in some round $t$ of both algorithms. There are two ways an interval $J(m)$ can be updated: either a woman is removed from the bottom of the interval, or a woman is removed from the top of the interval.

In the former case, a woman $\pi_i(m)$ is removed from $J_{t+1}(m)$ since $\pi_i(m) <_m \operatorname{best}_t(m)$. Suppose $\operatorname{best}_t(m) = \pi_j(m)$, where $j < i$. Since $m = \operatorname{top}_t(\pi_j(m))$, we know that $\mathcal{W}_t(\pi_j(m), m) = 0$, and so

$$\mathcal{M}_{t+1}(m,\pi_i(m)) = \mathcal{M}_t(m,\pi_{i-1}(m)) \wedge \bigwedge_{j \le i-1} \mathcal{W}_t(\pi_j(m),m) = 0. \tag{7.1}$$

Conversely, suppose $\mathcal{M}_{t+1}(m,\pi_i(m)) = 0$ while $\mathcal{M}_t(m,\pi_i(m)) = *$. Since $\mathcal{M}_t(m,\cdot)$ is non-increasing, we have $\mathcal{M}_t(m,\pi_{i-1}(m)) \ne 0$. Thus for $\mathcal{M}_{t+1}(m,\pi_i(m)) = 0$, we must have $\mathcal{W}_t(\pi_j(m), m) = 0$ for some $j < i$. If $m \ne \operatorname{top}_t(\pi_j(m))$, then equation (7.1) would already have held at an earlier time $s < t$, at which point $\mathcal{M}_s(m,\pi_i(m))$ would have become 0. Hence $m = \operatorname{top}_t(\pi_j(m))$, and $\pi_i(m)$ is removed from $J_{t+1}(m)$.

In the latter case, a woman $\pi_i(m)$ is removed from $J_{t+1}(m)$ since $\pi_i(m) = \operatorname{top}_t(m)$ and $m \notin J_t(\pi_i(m))$. We claim that $m <_{\pi_i(m)} J_t(\pi_i(m))$. Otherwise $m >_{\pi_i(m)} J_t(\pi_i(m))$, and then $(m,\pi_i(m))$ would be an unstable pair in any marriage produced by the algorithm ($m$ will be matched to a woman inferior to $\operatorname{top}_t(m) = \pi_i(m)$, and $\pi_i(m)$ will be matched to a man from $J_t(\pi_i(m))$ whom $\pi_i(m)$ doesn't like as much as $m$), which contradicts the fact that the algorithm produces some stable marriage. Hence we have $m <_{\pi_i(m)} J_t(\pi_i(m))$ and so $\mathcal{W}_t(\pi_i(m), m) = 1$. For $j \le i-1$, the woman $\pi_j(m)$ must have been removed from $J(m)$ in the past (since women can only be removed from the top of $J(m)$ one at a time), and at that time $\mathcal{W}(\pi_j(m), m) = 1$ (just as at the current time, $\mathcal{W}_t(\pi_i(m), m) = 1$); by Lemma 38 this value stays 1. Hence

$$\mathcal{M}_{t+1}(m,\pi_{i+1}(m)) = \mathcal{M}_t(m,\pi_i(m)) \wedge \bigwedge_{j \le i} \mathcal{W}_t(\pi_j(m), m) = 1.$$

Conversely, suppose $\mathcal{M}_{t+1}(m,\pi_{i+1}(m)) = 1$ while $\mathcal{M}_t(m,\pi_{i+1}(m)) = *$. Since $\mathcal{M}_t(m,\pi_i(m)) = 1$, we must have $\pi_i(m) = \operatorname{top}_t(m)$. Further, $\mathcal{W}_t(\pi_i(m), m) = 1$ shows that $m \notin J_t(\pi_i(m))$. Therefore $\pi_i(m)$ is removed from $J_{t+1}(m)$.

Finally, we conclude that the two algorithms return the same man-optimal and woman-optimal stable marriages by Lemma 37. ◄






Let us illustrate the workings of Algorithm 5. The following diagram illustrates a situation in which a man's interval shrinks from the bottom. The diagram illustrates $\mathcal{W}_t(w,\cdot)$, $\mathcal{M}_t(m,\cdot)$, $\mathcal{M}_{t+1}(m,\cdot)$, respectively, where $m = \operatorname{top}_t(w)$. The two elements in red form a pair $\mathcal{M}(m, w)$, $\mathcal{W}(w, m)$.



[Diagram: the rows $\mathcal{W}_t(w,\cdot)$, $\mathcal{M}_t(m,\cdot)$ and $\mathcal{M}_{t+1}(m,\cdot)$ when a man's interval shrinks from the bottom.]







The next diagram illustrates a situation in which a man's interval shrinks from the top. This time $w = \operatorname{top}_t(m)$.

[Diagram: the rows $\mathcal{W}_t(w,\cdot)$, $\mathcal{M}_t(m,\cdot)$ and $\mathcal{M}_{t+1}(m,\cdot)$ when a man's interval shrinks from the top.]








7.3.4 Subramanian's algorithm 

Subramanian's algorithm is very similar to Algorithm 5. The latter algorithm is not implementable using comparator circuits, since (for example) the value $\mathcal{W}_t(\pi_1(m), m)$ is used in the update rule of

$$\mathcal{M}_{t+1}(m,\pi_2(m)), \ldots, \mathcal{M}_{t+1}(m,\pi_n(m)).$$

Subramanian's algorithm, displayed as Algorithm 6, corrects this issue by retaining only the most important term in each conjunction and disjunction.

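Algorithm 6 Subramanian's algorithm

$$\mathcal{M}_0(m, w) = \begin{cases} 1 & \text{if } w = \pi_1(m) \\ * & \text{otherwise} \end{cases} \qquad \mathcal{W}_0(w, m) = \begin{cases} 0 & \text{if } m = \pi_1(w) \\ * & \text{otherwise} \end{cases}$$

$t \leftarrow 0$
repeat
    $\mathcal{M}_{t+1}(m,\pi_i(m)) = \begin{cases} 1 & \text{if } i = 1 \\ \mathcal{M}_t(m,\pi_{i-1}(m)) \wedge \mathcal{W}_t(\pi_{i-1}(m), m) & \text{otherwise} \end{cases}$
    $\mathcal{W}_{t+1}(w,\pi_i(w)) = \begin{cases} 0 & \text{if } i = 1 \\ \mathcal{W}_t(w,\pi_{i-1}(w)) \vee \mathcal{M}_t(\pi_{i-1}(w), w) & \text{otherwise} \end{cases}$
    $t \leftarrow t+1$
until $\mathcal{M}_t = \mathcal{M}_{t-1}$ and $\mathcal{W}_t = \mathcal{W}_{t-1}$
$S_m \leftarrow \{(m, w) : \mathcal{M}_t(m, w) = 1 \text{ and } \mathcal{W}_t(w, m) \in \{0, *\}\}$ % man-optimal stable marriage
$S_w \leftarrow \{(m, w) : \mathcal{W}_t(w, m) = 0 \text{ and } \mathcal{M}_t(m, w) \in \{1, *\}\}$ % woman-optimal stable marriage
return $S_m, S_w$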
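To make the update rule of Algorithm 6 concrete, here is a minimal Python sketch of it. It is an illustration under stated assumptions, not the authors' implementation: people are numbered from 0, `men_pref[m]` and `women_pref[w]` are hypothetical names for preference lists ordered from most to least preferred, and the three values $0, *, 1$ are encoded as the pairs (0,0), (0,1), (1,1), so that three-valued AND and OR act component-wise.

```python
# Three-valued logic encoded as pairs: 0 -> (0,0), * -> (0,1), 1 -> (1,1).
ZERO, STAR, ONE = (0, 0), (0, 1), (1, 1)


def and3(x, y):
    """Three-valued AND, computed component-wise on the pairs."""
    return (x[0] & y[0], x[1] & y[1])


def or3(x, y):
    """Three-valued OR, computed component-wise on the pairs."""
    return (x[0] | y[0], x[1] | y[1])


def subramanian(men_pref, women_pref):
    """Run the update rule until a fixed point; return (S_m, S_w) as sets of (m, w)."""
    n = len(men_pref)
    # Initial matrices: only the first choices are determined.
    M = [[ONE if w == men_pref[m][0] else STAR for w in range(n)] for m in range(n)]
    W = [[ZERO if m == women_pref[w][0] else STAR for m in range(n)] for w in range(n)]
    while True:
        new_M = [row[:] for row in M]
        new_W = [row[:] for row in W]
        for m in range(n):
            new_M[m][men_pref[m][0]] = ONE
            for i in range(1, n):
                prev_w = men_pref[m][i - 1]
                new_M[m][men_pref[m][i]] = and3(M[m][prev_w], W[prev_w][m])
        for w in range(n):
            new_W[w][women_pref[w][0]] = ZERO
            for i in range(1, n):
                prev_m = women_pref[w][i - 1]
                new_W[w][women_pref[w][i]] = or3(W[w][prev_m], M[prev_m][w])
        if new_M == M and new_W == W:
            break
        M, W = new_M, new_W
    pairs = [(m, w) for m in range(n) for w in range(n)]
    # S_m: M = 1 and W in {0, *};  S_w: W = 0 and M in {1, *}.
    S_m = {(m, w) for (m, w) in pairs if M[m][w] == ONE and W[w][m] != ONE}
    S_w = {(m, w) for (m, w) in pairs if W[w][m] == ZERO and M[m][w] != ZERO}
    return S_m, S_w
```

For instance, with `men_pref = [[0, 1], [1, 0]]` and `women_pref = [[1, 0], [0, 1]]` every man's first choice ranks him last, and the sketch returns `S_m = {(0, 0), (1, 1)}` and `S_w = {(0, 1), (1, 0)}`; the matrices stay at their initial values, so the loop stops after one round.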



In the analysis of Algorithm 5 we already saw that Subramanian's update rule works when an entry of $\mathcal{M}$ is set to 1. When Algorithm 5 sets an entry to 0, say $\mathcal{M}_{t+1}(m,\pi_i(m)) = 0$, it is due to $\mathcal{W}_t(\pi_j(m), m) = 0$ for some $j \le i-1$. If $j = i-1$, then Subramanian's algorithm will also set $\mathcal{M}_{t+1}(m,\pi_i(m)) = 0$. Otherwise, Subramanian's algorithm will set $\mathcal{M}_{t+1}(m,\pi_{j+1}(m)) = 0$, and this zero will propagate, so that $\mathcal{M}_{t+i-j}(m,\pi_i(m)) = 0$. This shows that in some sense, Subramanian's algorithm mimics Algorithm 5. It remains to show that Subramanian's algorithm computes the same final $\mathcal{M}$ and $\mathcal{W}$ as Algorithm 5.

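First, we notice that the termination conditions of both algorithms are really the same. For Subramanian's algorithm, the conditions are that for $i > 1$,

$$\begin{aligned} \mathcal{M}(m,\pi_i(m)) &= \mathcal{M}(m,\pi_{i-1}(m)) \wedge \mathcal{W}(\pi_{i-1}(m),m), \\ \mathcal{W}(w,\pi_i(w)) &= \mathcal{W}(w,\pi_{i-1}(w)) \vee \mathcal{M}(\pi_{i-1}(w), w). \end{aligned} \tag{7.2}$$

For Algorithm 5 the termination conditions are

$$\begin{aligned} \mathcal{M}(m,\pi_i(m)) &= \mathcal{M}(m,\pi_{i-1}(m)) \wedge \bigwedge_{j \le i-1} \mathcal{W}(\pi_j(m),m), \\ \mathcal{W}(w,\pi_i(w)) &= \mathcal{W}(w,\pi_{i-1}(w)) \vee \bigvee_{j \le i-1} \mathcal{M}(\pi_j(w),w). \end{aligned} \tag{7.3}$$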

► Lemma 40. The matrices $\mathcal{M}$, $\mathcal{W}$ at the end of Subramanian's algorithm satisfy the termination conditions of Algorithm 5, and vice versa. Moreover, these are always matrix representations of intervals.

Proof. We observe that in both algorithms the update rules guarantee that $\mathcal{M}(m,\cdot)$ is monotone non-increasing and that $\mathcal{W}(w,\cdot)$ is monotone non-decreasing, which implies

$$\mathcal{M}(\pi_{i-1}(w), w) = \bigvee_{j \le i-1} \mathcal{M}(\pi_j(w),w), \qquad \mathcal{W}(\pi_{i-1}(m),m) = \bigwedge_{j \le i-1} \mathcal{W}(\pi_j(m),m).$$

Thus, these two termination conditions are equivalent.

We call a pair of matrices $(\mathcal{M}, \mathcal{W})$ a feasible pair if they satisfy the equations in (7.2), or equivalently in (7.3), and furthermore $\mathcal{M}(m,\pi_1(m)) = 1$ and $\mathcal{W}(w,\pi_1(w)) = 0$ for every man $m$ and woman $w$. The following lemma shows that, in some sense, Subramanian's algorithm is admissible.

► Lemma 41. Let $\mathcal{M}$, $\mathcal{W}$ be the matrices at the end of Subramanian's algorithm. If $\mathcal{M}(m, w) = c \ne *$ for some $m, w$, then $\mathcal{M}'(m, w) = c$ for any feasible pair $(\mathcal{M}', \mathcal{W}')$. The same holds for $\mathcal{W}$.

Proof. The proof is by induction on the time $t$ at which $\mathcal{M}_t(m, w)$ is set to $c$. If $t = 0$, then the claim follows from the definition of feasible pair. Otherwise, for some $i$ we have $\mathcal{M}_t(m,\pi_i(m)) = \mathcal{M}_{t-1}(m,\pi_{i-1}(m)) \wedge \mathcal{W}_{t-1}(\pi_{i-1}(m), m)$. If $c = 1$ then $\mathcal{M}_{t-1}(m,\pi_{i-1}(m)) = \mathcal{W}_{t-1}(\pi_{i-1}(m), m) = 1$, and by induction these entries get the same value in all feasible pairs. The definition of feasible pair then implies that $\mathcal{M}'(m,\pi_i(m)) = 1$ in any feasible pair $(\mathcal{M}', \mathcal{W}')$. The case when $c = 0$ can be shown similarly. ◄

The following lemma shows that we can uniquely extract a stable marriage from each 0/1-valued feasible pair (that is, a feasible pair in which both matrices have only 0/1 values), and vice versa.

► Lemma 42. Suppose $(\mathcal{M}, \mathcal{W})$ is a 0/1-valued feasible pair. If we marry each man $m$ to $\min_m \{w : \mathcal{M}(m, w) = 1\}$, and each woman $w$ to $\min_w \{m : \mathcal{W}(w, m) = 0\}$, then the result is a stable marriage.

Conversely, every stable marriage $P$ can be encoded as a 0/1-valued feasible pair, as follows: For each man $m$, we put $\mathcal{M}(m, w) = 1$ if $w \ge_m P(m)$, and $\mathcal{M}(m, w) = 0$ otherwise. For each woman $w$, we put $\mathcal{W}(w, m) = 0$ if $m \ge_w P(w)$, and $\mathcal{W}(w, m) = 1$ otherwise.

The two mappings are inverses of each other.
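The two mappings of Lemma 42 can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: people are numbered from 0, `marriage[m]` is a hypothetical name for the wife of man `m`, preference lists run from most to least preferred, and feasibility is checked through the equations (7.2) together with the two boundary conditions.

```python
def ranks(pref):
    """rank[p][q] = position of q in p's preference list (0 = most preferred)."""
    return [{q: i for i, q in enumerate(lst)} for lst in pref]


def encode(marriage, men_pref, women_pref):
    """Encode a marriage (marriage[m] = wife of m) as 0/1 matrices (M, W)."""
    n = len(men_pref)
    rm, rw = ranks(men_pref), ranks(women_pref)
    husband = {w: m for m, w in enumerate(marriage)}
    # M(m, w) = 1 iff m likes w at least as much as his wife.
    M = [[1 if rm[m][w] <= rm[m][marriage[m]] else 0 for w in range(n)]
         for m in range(n)]
    # W(w, m) = 0 iff w likes m at least as much as her husband.
    W = [[0 if rw[w][m] <= rw[w][husband[w]] else 1 for m in range(n)]
         for w in range(n)]
    return M, W


def is_feasible(M, W, men_pref, women_pref):
    """Check the boundary conditions and equations (7.2) for 0/1 matrices."""
    n = len(men_pref)
    for m in range(n):
        p = men_pref[m]
        if M[m][p[0]] != 1:
            return False
        for i in range(1, n):
            if M[m][p[i]] != (M[m][p[i - 1]] & W[p[i - 1]][m]):
                return False
    for w in range(n):
        p = women_pref[w]
        if W[w][p[0]] != 0:
            return False
        for i in range(1, n):
            if W[w][p[i]] != (W[w][p[i - 1]] | M[p[i - 1]][w]):
                return False
    return True


def decode(M, men_pref):
    """Marry each man to his least-preferred woman w with M[m][w] = 1."""
    return [max((w for w in p if M[m][w] == 1), key=p.index)
            for m, p in enumerate(men_pref)]
```

With `men_pref = [[0, 1], [0, 1]]` and `women_pref = [[1, 0], [0, 1]]`, the stable marriage `[1, 0]` encodes to a feasible pair and decodes back to itself, whereas the unstable matching `[0, 1]` encodes to a pair that violates (7.2).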



Proof. Feasible pair implies stable marriage. Suppose $(\mathcal{M}, \mathcal{W})$ is a 0/1-valued feasible pair. We start by showing that the mapping $P$ in the statement of the lemma is indeed a marriage.

We call a person desperate if he or she is married to the last choice in his or her preference list.

If a man $m$ is not desperate then for some $1 \le i < n$, $\mathcal{M}(m,\pi_i(m)) = 1$ and $\mathcal{M}(m,\pi_{i+1}(m)) = 0$. This can only happen if $\mathcal{W}(\pi_i(m), m) = 0$, and furthermore if $m >_{\pi_i(m)} m'$ then $\mathcal{W}(\pi_i(m), m') = 1$ due to $\mathcal{M}(m,\pi_i(m)) = 1$. This shows that whenever a man $m$ is not desperate, $m$ is married to someone in the marriage $P$.

If $m$ is desperate and $\mathcal{W}(\pi_n(m), m) = 0$, then $m$ is married as before. Otherwise, no woman is desperate, and so similarly to the previous argument, it can be shown that every woman $w$ is married in $P$. However, the fact that $P(w) \ne m$ for all women $w$ contradicts the pigeonhole principle. Thus, we conclude that $P$ is a marriage.

It remains to show that $P$ is stable. Suppose that $(m, w)$ were an unstable pair in $P$. Then $\mathcal{M}(m, w) = 1$ and $\mathcal{W}(w, m) = 0$, and moreover $\mathcal{M}(m, w') = 1$ for some $w' <_m w$. Yet (7.3) shows that $\mathcal{M}(m, w') \le \mathcal{W}(w, m)$, and we reach a contradiction.

Stable marriage implies feasible pair. Suppose $m$ is matched to $\pi_k(m)$. Consider first the case $i \le k$. Then $\mathcal{M}(m,\pi_i(m)) = \mathcal{M}(m,\pi_{i-1}(m)) = 1$, and we have to show that $\mathcal{W}(\pi_{i-1}(m), m) = 1$. If the latter weren't true then $(m,\pi_{i-1}(m))$ would be an unstable pair, since $\pi_{i-1}(m) >_m \pi_k(m)$ while $\mathcal{W}(\pi_{i-1}(m), m) = 0$ implies that $\pi_{i-1}(m)$ prefers $m$ to the man who is matched to her. If $i = k+1$ then $\mathcal{M}(m,\pi_i(m)) = 0$ and also $\mathcal{W}(\pi_{i-1}(m), m) = 0$. If $i > k+1$ then $\mathcal{M}(m,\pi_i(m)) = \mathcal{M}(m,\pi_{i-1}(m)) = 0$.

The mappings are inverses of each other. It is easy to check directly from the definition that if we start with a stable marriage $P$, convert it to a feasible pair $(\mathcal{M}, \mathcal{W})$, and convert it back into a stable marriage $P'$, then $P = P'$.

For the other direction, Lemma 40 shows that if $(\mathcal{M}, \mathcal{W})$ is a 0/1-valued feasible pair then for each man $m$, $\mathcal{M}(m,\cdot)$ consists of a positive number of 1s followed by 0s, and dually for each woman $w$, $\mathcal{W}(w,\cdot)$ consists of a positive number of 0s followed by 1s. Thus, given the fact that our rule of converting $(\mathcal{M}, \mathcal{W})$ to a stable marriage $P$ indeed results in a marriage, it is clear that converting $P$ back to a feasible pair results in $(\mathcal{M}, \mathcal{W})$. ◄

► Lemma 43. Subramanian's algorithm returns the man-optimal and woman-optimal stable marriages. Furthermore, the matrices $\mathcal{M}$, $\mathcal{W}$ at the end of Subramanian's algorithm coincide with the matrices $\mathcal{M}$, $\mathcal{W}$ at the end of Algorithm 5.

Proof. The monotonicity of $\wedge$ and $\vee$ shows that if we replace every $*$ in $\mathcal{M}$, $\mathcal{W}$ with 0, then the resulting $(\mathcal{M}, \mathcal{W})$ is still a feasible pair; the same holds if we replace every $*$ with 1.

Lemma 41 and Lemma 42 together imply that the first output is the man-optimal stable matching, and the second output is the woman-optimal stable matching. Lemma 40 shows that at termination, the matrices $\mathcal{M}$, $\mathcal{W}$ are matrix representations of intervals, hence they must coincide with the matrices at the end of Algorithm 5. ◄

► Lemma 44. Subramanian's algorithm terminates after at most $2n^2$ iterations.

Proof. Since there are $2n^2$ entries in both matrices, and at each iteration at least one entry changes from $*$ to 0 or 1, the algorithm terminates after at most $2n^2$ iterations. ◄

A formal correctness proof of Subramanian's algorithm can be found in [11].

7.3.5 MoSm and WoSm are $\mathsf{AC}^0$ many-one reducible to Ccv

In the remainder of this section, we show that Subramanian's algorithm can be implemented as a three-valued comparator circuit.

First, for each man $m$, the pair of values $\mathcal{M}_t(m,\pi_{i-1}(m))$ and $\mathcal{W}_t(\pi_{i-1}(m), m)$ is used only once, to compute the two outputs $\mathcal{M}_t(m,\pi_{i-1}(m)) \wedge \mathcal{W}_t(\pi_{i-1}(m), m)$ and $\mathcal{M}_t(m,\pi_{i-1}(m)) \vee \mathcal{W}_t(\pi_{i-1}(m), m)$, and each output is then used at most once, when updating $\mathcal{M}_{t+1}(m,\pi_i(m))$ and the corresponding entry of $\mathcal{W}_{t+1}$. Thus the whole update rule can be easily implemented using comparator gates.

Second, we know that the algorithm converges within $2n^2$ iterations to a fixed point. Therefore, if we run the loop for exactly $2n^2$ iterations, the result is the same. Hence, we can build a comparator circuit to simulate exactly $2n^2$ iterations of Subramanian's algorithm.

Finally, we can extract the man-optimal stable matching using a simple comparator circuit with negation gates. Recall that the logical values $0, *, 1$ are represented in reality by pairs of wires with values $(0, 0)$, $(0, 1)$, $(1, 1)$. In the man-optimal stable matching, a man $m$ is matched to $\pi_i(m)$ if $\mathcal{M}(m,\pi_i(m)) = 1$ and either $i = n$ or $\mathcal{M}(m,\pi_{i+1}(m)) \in \{0, *\}$. In the latter case, if the corresponding wires are $(\alpha,\beta)$ and $(\gamma,\delta)$, then the required information can be extracted as $\alpha \wedge \beta \wedge \neg\gamma$.
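The extraction formula can be checked exhaustively against the pair encoding. The short Python sketch below does this; the names `ENCODING` and `is_matched` are illustrative only and do not come from the paper.

```python
# Pair-of-wires encoding of the three logical values, as recalled in the text.
ENCODING = {"0": (0, 0), "*": (0, 1), "1": (1, 1)}


def is_matched(current, following):
    """Evaluate alpha AND beta AND (NOT gamma) on the encoded wires.

    `current` stands for M(m, pi_i(m)) and `following` for M(m, pi_{i+1}(m)),
    each given as one of the strings "0", "*", "1".
    """
    alpha, beta = ENCODING[current]
    gamma, _delta = ENCODING[following]  # delta is not needed by the formula
    return bool(alpha & beta & (1 - gamma))
```

Over all nine combinations, `is_matched(a, b)` is true exactly when `a == "1"` and `b` is `"0"` or `"*"`, which is the condition stated above.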

► Theorem 45. MoSm and WoSm are $\mathsf{AC}^0$ many-one reducible to Ccv¬.

Proof. We will show only the reduction from MoSm to Ccv¬, since the reduction from WoSm to Ccv¬ works similarly.

Following the above construction, we can define an $\mathsf{AC}^0$ function that takes as input an instance of MoSm with preference lists for all the men and women, and produces a three-valued comparator circuit that implements Subramanian's algorithm, and then extracts the man-optimal stable matching. ◄

Corollary 28 and Theorems 31, 30 and 45 give us the following corollary.

► Corollary 46. The ten problems MoSm, WoSm, Sm, Ccv, Ccv¬, Three-valued Ccv, 3Lfmm, Lfmm, 3vLfmm and vLfmm are all equivalent under $\mathsf{AC}^0$ many-one reductions, where the equivalence of Sm is with respect to the search problem version of the reduction defined in Section 2.

Proof. Corollary 28 and Theorem 31 show that Ccv, Ccv¬, Three-valued Ccv, 3Lfmm, Lfmm, 3vLfmm and vLfmm are all equivalent under $\mathsf{AC}^0$ many-one reductions.

Theorem 45 shows that MoSm and WoSm are $\mathsf{AC}^0$ many-one reducible to Three-valued Ccv. Theorem 30 also shows that 3Lfmm is $\mathsf{AC}^0$ many-one reducible to MoSm, WoSm, and Sm. Hence, MoSm, WoSm, and Sm are equivalent to the above problems under $\mathsf{AC}^0$ many-one reductions. ◄

A Simplified proof of NL ⊆ CC

Each instance of the Reachability problem consists of a directed acyclic graph $G = (V, E)$, where $V = \{u_0, \ldots, u_{n-1}\}$, and we want to decide if there is a path from $u_0$ to $u_{n-1}$. It is well-known that Reachability is NL-complete. Since a directed graph can be converted into a layered graph with an equivalent reachability problem, it suffices to give a comparator circuit construction that solves instances of Reachability satisfying the following assumption:

The graph $G$ only has directed edges of the form $(u_i, u_j)$, where $i < j$. (A.1)

The following construction from [11] for showing that NL ⊆ CC seems more intuitive than the one in [16], [12]. Moreover, it reduces Reachability to Ccv directly without going through some intermediate complete problem, and this was stated as an open problem in [16, Chapter 7.8.1].

We will demonstrate the construction through a simple example, where we have the directed graph in Fig. 13 satisfying the assumption (A.1). We will build a comparator circuit as in Fig. 14, where the wires $\nu_0, \ldots, \nu_4$ represent the vertices $u_0, \ldots, u_4$ of the preceding graph and the wires $\iota_0, \ldots, \iota_4$ are used to feed 1-bits into the wire $\nu_0$, and from there to the other wires $\nu_i$ reachable from $\nu_0$. We let every wire $\iota_i$ take input 1 and every wire $\nu_i$ take input 0.

[Figure 13: example directed graph on the vertices $u_0, \ldots, u_4$]

We next show how to construct the gadget contained in each box. For a graph with $n$ vertices ($n = 5$ in our example), the $k$-th gadget is constructed as follows:

Introduce a comparator gate from wire $\iota_k$ to wire $\nu_0$
for $i = 0, \ldots, n-1$ do
    for $j = i+1, \ldots, n-1$ do
        Introduce a comparator gate from $\nu_i$ to $\nu_j$ if $(u_i, u_j) \in E$, or a dummy gate on $\nu_i$ otherwise.
    end for
end for

Note that the gadgets are identical except for the first comparator gate.

We only use the loop structure to clarify the order in which the gates are added. The construction can easily be done in $\mathsf{AC}^0$ since the position of each gate can be calculated exactly, and thus all gates can be added independently from one another. Note that for a graph with $n$ vertices, we have at most $n$ vertices reachable from a single vertex, and thus we need $n$ gadgets as described above. In our example, there are at most 5 wires reachable from wire $\nu_0$, and thus we utilize the gadget 5 times.
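The gadget construction can be simulated directly. The following Python sketch is an illustration under assumptions that go beyond the text: a comparator gate "from $x$ to $y$" is taken to leave $x \wedge y$ on wire $x$ and $x \vee y$ on wire $y$ (so a 1 moves from $x$ to $y$ when $y$ carries 0), dummy gates are simply omitted, and the wire layout (indices $0..n-1$ for the $\iota$ wires, $n..2n-1$ for the $\nu$ wires) and all function names are hypothetical.

```python
def build_circuit(n, edges):
    """Return the gate list (src_wire, dst_wire) for n gadgets.

    Wires 0..n-1 are the iota wires; wires n..2n-1 are the nu wires.
    `edges` is a collection of pairs (i, j) with i < j, as in assumption (A.1).
    """
    edge_set = set(edges)
    gates = []
    for k in range(n):
        gates.append((k, n))  # gate from iota_k to nu_0
        for i in range(n):
            for j in range(i + 1, n):
                if (i, j) in edge_set:
                    gates.append((n + i, n + j))  # gate from nu_i to nu_j
    return gates


def evaluate(gates, inputs):
    """Evaluate the circuit: a gate (x, y) puts AND on wire x and OR on wire y."""
    wires = list(inputs)
    for x, y in gates:
        a, b = wires[x], wires[y]
        wires[x], wires[y] = a & b, a | b
    return wires


def reachable_vertices(n, edges):
    """Vertices whose nu wire carries 1 after the n gadgets."""
    wires = evaluate(build_circuit(n, edges), [1] * n + [0] * n)
    return {i for i in range(n) if wires[n + i] == 1}


def reachable(n, edges):
    """Decide whether u_{n-1} is reachable from u_0."""
    return (n - 1) in reachable_vertices(n, edges)
```

On the hypothetical edge set `[(0, 1), (1, 2), (0, 3)]` with `n = 5`, the marked vertices are `{0, 1, 2, 3}` and `reachable` returns `False`; adding the edge `(3, 4)` makes it return `True`.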



Figure 14 A comparator circuit that solves Reachability. (The dummy gates are omitted.)



Intuitively, the construction works since each gadget from a box looks for the lexicographically first maximal path starting from $\nu_0$ (with respect to the natural lexicographical ordering induced by the vertex ordering $\nu_0, \ldots, \nu_{n-1}$), and then the vertex at the end of the path will be marked (i.e. its wire will now carry 1) and thus excluded from the search of the gadgets that follow. For example, the gadget from the left-most dashed box in Fig. 14 will move a value 1 from wire $\iota_0$ to wire $\nu_0$ and from wire $\nu_0$ to wire $\nu_1$. This essentially "marks" the wire $\nu_1$ since we cannot move this value 1 away from $\nu_1$, and thus $\nu_1$ can no longer receive any new incoming 1. Hence, the gadget from the second box in Fig. 14 will repeat the process of finding the lex-first maximal path from $\nu_0$ to the remaining (unmarked) vertices. These searches end when all vertices reachable from $\nu_0$ are marked. Note that this has the same effect as applying the depth-first search algorithm to find all the vertices reachable from $\nu_0$. Thus, the following theorem follows from the above construction.

► Theorem 47 (Feder [12]). NL ⊆ CC.